Abstract. We present an Adaptive Parameterized-Background Data-Weak (APBDW) approach to the variational data assimilation (state estimation) problem. The approach is based on the Tikhonov regularization of the PBDW formulation [Y Maday, AT Patera, JD Penn, M Yano, Int J Numer Meth Eng, 102(5), 933-965], and exploits the connection between PBDW and kernel methods for regression. An adaptive procedure is presented to handle the experimental noise. A priori and a posteriori estimates for the $L^2$ state-estimation error motivate the approach and guide the adaptive procedure. We present results for two synthetic model problems to illustrate the elements of the methodology. We also consider an experimental thermal patch configuration to demonstrate the applicability of our approach to real physical systems.

Résumé. We present an Adaptive Parameterized-Background Data-Weak (APBDW) approach to the variational data assimilation problem. The approach is based on the Tikhonov regularization of the PBDW formulation [Y Maday, AT Patera, JD Penn, M Yano, Int J Numer Meth Eng, 102(5), 933-965], and consists of an adaptive procedure that accounts for experimental noise. A priori and a posteriori estimates of the $L^2$ state-estimation error motivate the approach and guide the adaptive procedure. We present results for two synthetic model problems to illustrate the elements of the methodology. We also consider an experimental thermal patch configuration to show that our approach is applicable to physical systems.
1. Introduction

Data assimilation refers to the estimation of the state $u^{\rm true}$ of a physical system over the domain of interest $\Omega \subset \mathbb{R}^d$ by combining experimental data with a mathematical model of the dynamics of the system. For real-time and in situ applications, data assimilation techniques should provide an estimate of the state rapidly with little or no communication with extensive offline resources. Furthermore, for safety reasons, it is key to certify the reliability of our estimate using either probabilistic (i.e., confidence intervals) or deterministic (i.e., error bounds) approaches.

The goal of this work is to develop a variational data assimilation procedure that combines a parameterized best-knowledge mathematical model and experimental data to rapidly obtain a reliable estimate of the state $u^{\rm true} \in \mathcal{U}$ over $\Omega$. We denote by $\{y_m\}_{m=1}^M$ the set of experimental measurements, and we denote by $u^{\rm bk}(\mu) \in \mathcal{U}$ the solution to our parameterized best-knowledge mathematical model for the parameter value $\mu \in \mathcal{P}$. Here, the space $\mathcal{U} = \mathcal{U}(\Omega)$ is a suitable Hilbert space defined over $\Omega$, while $\mathcal{P} \subset \mathbb{R}^P$ reflects the uncertainty in the value of the parameters associated with the model. Since experimental apparatuses are typically affected by errors, measurements are in general of the form $y_m = \ell_m(u^{\rm true}) + \epsilon_m$, where $\ell_m : \mathcal{U} \to \mathbb{R}$ is a known functional and $\epsilon_m$ reflects the observational noise. On the other hand, the uncertainty in the parameters of the model leads to the definition of the best-knowledge manifold $\mathcal{M}^{\rm bk} := \{u^{\rm bk}(\mu) : \mu \in \mathcal{P}\} \subset \mathcal{U}$, which collects the solution to the parameterized best-knowledge model for all values of the parameter in $\mathcal{P}$.

In [24,25], Maday et al. introduced the so-called Parameterized-Background Data-Weak (PBDW) approach. The key idea of the PBDW formulation is to seek an approximation $u^\star = z^\star + \eta^\star$ to the true field $u^{\rm true}$ employing projection-by-data. The first contribution to $u^\star$, $z^\star \in \mathcal{Z}_N$, is the "deduced background estimate". The linear $N$-dimensional space $\mathcal{Z}_N \subset \mathcal{U}$ is informed by the best-knowledge manifold $\mathcal{M}^{\rm bk}$, which we hope is close to the true field. The second contribution to $u^\star$, $\eta^\star \in \mathcal{U}_M$, is the "update estimate". The linear $M$-dimensional space $\mathcal{U}_M$ is the span of the Riesz representations of the $M$ observation functionals $\{\ell_m\}_{m=1}^M$. While the background estimate incorporates our a priori knowledge of the state, the update addresses the deficiencies of the best-knowledge model by improving the approximation properties of the search space. Projection-by-data, as opposed to projection-by-model, implies that the parameterized model is not directly used during the data assimilation procedure. This feature significantly simplifies the computational procedure and guarantees real-time responses.

In this work, we present an Adaptive Parameterized-Background Data-Weak (APBDW) approach that extends the original PBDW formulation to the case of pointwise noisy measurements: $\ell_m := \delta_{x_m}$ for some $x_m \in \Omega$, $m = 1, \ldots, M$. Our approach is based on the Tikhonov regularization of the PBDW formulation and relies on an adaptive procedure for the selection of the hyper-parameters. The adaptive procedure chooses the hyper-parameters that minimize an estimate of the $L^2$ state-estimation error on a validation dataset. The extension to pointwise measurements is based on the theory of Reproducing Kernel Hilbert Spaces (RKHS, [1]) and exploits the connection between the PBDW formulation and kernel methods for regression ([9,30]). We also present a priori and a posteriori error estimates for measurements affected by either systematic or homoscedastic random error. These estimates motivate the approach from a theoretical standpoint, and guide the adaptive procedure.

Our approach shares some key features with a number of existing techniques from the statistical learning and data assimilation literature. In the statistical learning literature, our approach is equivalent to the approach presented in [16] by Kimeldorf and Wahba. However, in their paper, the authors did not relate the background space $\mathcal{Z}_N$ to the solution manifold associated with a parameterized PDE and they did not discuss how to practically build the space based on the available prior knowledge about the state. In the data assimilation literature, our work is related to the approach presented by Bennett in [2,3], and to the so-called 3D-VAR, first introduced by Lorenc [21,22] for steady problems, then extended to time-dependent problems under the name of 4D-VAR ([8]) and more recently coupled with model order reduction techniques ([6,31]) to reduce the computational costs. Although the formulations proposed by Bennett and Lorenc are informed by the solution to a best-knowledge differential model, neither approach is naturally designed to include a parameterized background space in the formulation. The a priori error analysis can be seen as a generalization of the work of Krebs et al. [18], while the a posteriori error analysis is related to [29]. Finally, the adaptive procedure employed in this work, which is known in statistics as holdout validation (see, e.g., [15, Chapter 7] and [17]), is strongly connected to the approach proposed in [25, section 5.8] to improve the approximation properties of the background space $\mathcal{Z}_N$ for sequential data assimilation.

The outline of the paper is as follows. In section 2, we present the formulation and the well-posedness analysis. We further relate our formulation to a number of other methods proposed in the data assimilation and statistical learning literature. Then, in section 3, we present a priori and a posteriori estimates for the $L^2(\Omega)$ state-estimation error. In section 4, we exploit the error analysis to design an adaptive procedure for the selection of the parameters. In section 5, we present numerical results for two synthetic problems. Finally, in section 6, we present the results for a physical thermal patch problem.

2.  Formulation 


2.1.  Preliminaries 

By way of preliminaries, we introduce notation used throughout the paper. Given the Lipschitz domain $\Omega \subset \mathbb{R}^d$, we introduce the Reproducing Kernel Hilbert Space (RKHS) $\mathcal{U}$ endowed with the inner product $(\cdot,\cdot)$ and the induced norm $\|\cdot\| = \sqrt{(\cdot,\cdot)}$. We further denote by $K : \Omega \times \Omega \to \mathbb{R}$ the reproducing kernel associated with $\mathcal{U}$, which is assumed to be known explicitly. Finally, for any closed subspace $\mathcal{Q} \subset \mathcal{U}$, we denote by $\Pi_{\mathcal{Q}} : \mathcal{U} \to \mathcal{Q}$ the orthogonal projection operator onto $\mathcal{Q}$, and we denote by $\mathcal{Q}^\perp$ its orthogonal complement.

We now introduce the dataset of experimental observations considered in this work. We consider the dataset $\mathcal{D}_M := \{(x_m, y_m)\}_{m=1}^M$, where $X_M := \{x_1, \ldots, x_M\} \subset \Omega$ represents the set of observation points, while $Y_M := \{y_1, \ldots, y_M\} \subset \mathbb{R}$ are approximations of the true state at the observation points, $y_m \approx u^{\rm true}(x_m)$.

We briefly provide the definition of RKHS and we recall a number of properties exploited in this work. For the purpose of this paper, a RKHS is a Hilbert space such that the evaluation functional $\delta_x$ belongs to the dual space of $\mathcal{U}$ for any $x \in \Omega$. If we denote by $\phi : x \in \Omega \mapsto \phi(x) \in \mathcal{U}$ the feature map associated with $\mathcal{U}$ such that $(\phi(x), f) = f(x)$ for any $x \in \Omega$ and any $f \in \mathcal{U}$, the reproducing kernel associated with $\mathcal{U}$ is given by $K(x,y) = (\phi(x), \phi(y))$. It is easy to verify that $K$ is symmetric, $K(x,y) = K(y,x)$ for all $x, y \in \Omega$. It is also easy to verify that $K$ is positive definite, that is,

$$\sum_{i,j=1}^{N} c_i \, c_j \, K(x_j, x_i) > 0$$

for all distinct $\{x_i\}_{i=1}^N \subset \Omega$, for all $c \in \mathbb{R}^N \setminus \{0\}$ and for any $N > 0$.

By the Moore-Aronszajn theorem ([1]), there exists a duality between symmetric positive definite kernels and RKHS: given a symmetric positive definite (SPD) kernel $K$, there exists a RKHS for which $K$ is the reproducing kernel. This RKHS is referred to as the native space of $K$. We can thus first choose an explicit SPD kernel and then exploit the Moore-Aronszajn theorem to recover the variational interpretation.


2.2.  Adaptive  PBDW  statement 


We introduce some definitions. First, we define the $N$-dimensional background space $\mathcal{Z}_N = {\rm span}\{\zeta_n\}_{n=1}^N \subset \mathcal{U}$. The space $\mathcal{Z}_N$ encodes our prior knowledge about the state $u^{\rm true}$; we refer to section 2.5.1 for practical strategies for the construction of the background space. Given the dataset $\mathcal{D}_M$, we further introduce the empirical risk $V_M : \mathcal{U} \to \mathbb{R}$:

$$V_M(u) = \frac{1}{M} \sum_{m=1}^{M} \left( u(x_m) - y_m \right)^2. \tag{1}$$

We have now the elements to state the Adaptive Parameterized-Background Data-Weak (APBDW) formulation: given $\xi > 0$, find $u_\xi^\star \in \mathcal{U}$ such that

$$u_\xi^\star = \arg\min_{u \in \mathcal{U}} \; J_\xi(u) := \xi \, \|\Pi_{\mathcal{Z}_N^\perp} u\|^2 + V_M(u). \tag{2}$$

To simplify notation, in the state estimate $u_\xi^\star$ we omit the subscripts associated with the discretization parameters $M$ and $N$.


2.3.  Well-posedness  analysis 

Proposition 2.1 provides a sufficient condition for the well-posedness of problem (2), and an explicit low-dimensional representation for the solution that allows real-time calculations. The result was first proved by Kimeldorf and Wahba in [16] in a slightly different form. Proposition 2.4 shows that problem (2) can be reformulated as a two-field minimization statement. The latter result will serve to generalize our statement to the case in which the background does not belong to $\mathcal{U}$.

Proposition 2.1. Let $\mathcal{U}_M := {\rm span}\{K_{x_m}\}_{m=1}^M$ and let $\beta_{N,M}$ be defined as

$$\beta_{N,M} := \inf_{z \in \mathcal{Z}_N} \; \sup_{v \in \mathcal{U}_M} \; \frac{(z,v)}{\|z\| \, \|v\|}. \tag{3}$$

Then, if $\beta_{N,M} > 0$, the solution $u_\xi^\star$ to (2) exists and is unique. Furthermore, the solution to (2) is of the form

$$u_\xi^\star = \eta_\xi^\star + z_\xi^\star, \tag{4}$$

where $\eta_\xi^\star \in \mathcal{U}_M \cap \mathcal{Z}_N^\perp$ and $z_\xi^\star \in \mathcal{Z}_N$.

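In view of the proof of Proposition 2.1, we state and prove two lemmas.

Lemma 2.2. ([36]) Let $\mathcal{U}_M := {\rm span}\{K_{x_m}\}_{m=1}^M$ and let $\beta_{N,M}$ be defined as in (3). Then, we have that

$$\beta_{N,M} = \inf_{q \in \mathcal{U}_M^\perp} \frac{\|\Pi_{\mathcal{Z}_N^\perp} q\|}{\|q\|}. \tag{5}$$

Proof. To simplify notation, given the linear space $\mathcal{Q}$, we define $\mathcal{Q}^{(1)} = \{q \in \mathcal{Q} : \|q\| = 1\}$. We now prove (5):

$$
\begin{aligned}
\beta_{N,M}^2
&= \Big( \inf_{z \in \mathcal{Z}_N^{(1)}} \sup_{v \in \mathcal{U}_M^{(1)}} (z,v) \Big)^2
 = \inf_{z \in \mathcal{Z}_N^{(1)}} \|\Pi_{\mathcal{U}_M} z\|^2
 = 1 - \sup_{z \in \mathcal{Z}_N^{(1)}} \|\Pi_{\mathcal{U}_M^\perp} z\|^2 \\
&= 1 - \Big( \sup_{z \in \mathcal{Z}_N^{(1)}} \sup_{q \in \mathcal{U}_M^{\perp,(1)}} (z,q) \Big)^2
 = 1 - \Big( \sup_{q \in \mathcal{U}_M^{\perp,(1)}} \sup_{z \in \mathcal{Z}_N^{(1)}} (z,q) \Big)^2 \\
&= 1 - \sup_{q \in \mathcal{U}_M^{\perp,(1)}} \|\Pi_{\mathcal{Z}_N} q\|^2
 = \inf_{q \in \mathcal{U}_M^{\perp,(1)}} \|\Pi_{\mathcal{Z}_N^\perp} q\|^2.
\end{aligned}
$$

Thesis follows. □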

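Lemma 2.3. Let $\mathcal{U}_{M'} := {\rm span}\{K_{x_m}\}_{m=1}^{M'}$, $M' \le M$. Let us introduce $\beta_{N,M'} = \inf_{z \in \mathcal{Z}_N} \sup_{v \in \mathcal{U}_{M'}} \frac{(z,v)}{\|z\| \, \|v\|}$, and the matrix $\mathbb{K}^{(M')} \in \mathbb{R}^{M',M'}$, $\mathbb{K}^{(M')}_{m,m'} = K(x_m, x_{m'})$. Let us further define

$$c_{N,M} := \max_{M' \le M} c_{N,M'}, \qquad c_{N,M'} = \min \left\{ \frac{1}{2} \lambda_{\min}(\mathbb{K}^{(M')}), \; \frac{\lambda_{\min}(\mathbb{K}^{(M')})}{2 + \lambda_{\min}(\mathbb{K}^{(M')})} \, \beta_{N,M'}^2 \right\}, \tag{6a}$$

where $\lambda_{\min}(\mathbb{K}^{(M')})$ denotes the minimum eigenvalue of the matrix $\mathbb{K}^{(M')}$.

Then, the following bound holds:

$$J(u) := \|\Pi_{\mathcal{Z}_N^\perp} u\|^2 + \sum_{m=1}^{M} u(x_m)^2 \ge c_{N,M} \|u\|^2, \qquad \forall \, u \in \mathcal{U}. \tag{6b}$$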

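Proof. We first claim that for any $M'$ such that $\beta_{N,M'} > 0$ we have

$$J_{M'}(u) := \|\Pi_{\mathcal{Z}_N^\perp} u\|^2 + \sum_{m=1}^{M'} u(x_m)^2 \ge c_{N,M'} \|u\|^2, \qquad \forall \, u \in \mathcal{U}. \tag{7}$$

Given (7), we find that

$$J(u) \ge J_{M'}(u) \ge c_{N,M'} \|u\|^2 \quad \forall \, M' \le M \;\; \Rightarrow \;\; J(u) \ge \Big( \max_{M' \le M} c_{N,M'} \Big) \|u\|^2,$$

which is the thesis.

We now show (7). Given $u \in \mathcal{U}$, we introduce $u_1 = \Pi_{\mathcal{U}_{M'}^\perp} u$ and $u_2 = \Pi_{\mathcal{U}_{M'}} u = \sum_{m=1}^{M'} (\mathbf{u}_2)_m K_{x_m}$. Then, since $K_{x_m} \in \mathcal{U}_{M'}$, we observe that

$$u_1(x_m) = (K_{x_m}, u_1) = 0, \qquad m = 1, \ldots, M'. \tag{8}$$

We further observe that

$$\sum_{m=1}^{M'} u_2(x_m)^2 = \|\mathbb{K}^{(M')} \mathbf{u}_2\|_2^2, \qquad \|u_2\|^2 = \mathbf{u}_2^T \, \mathbb{K}^{(M')} \, \mathbf{u}_2,$$

which implies that

$$\min_{u_2 \in \mathcal{U}_{M'}} \frac{\sum_{m=1}^{M'} u_2(x_m)^2}{\|u_2\|^2} = \min_{\mathbf{u}_2 \in \mathbb{R}^{M'}} \frac{\|\mathbb{K}^{(M')} \mathbf{u}_2\|_2^2}{\mathbf{u}_2^T \, \mathbb{K}^{(M')} \, \mathbf{u}_2} = \lambda_{\min}(\mathbb{K}^{(M')}). \tag{9}$$

Combining (8) and (9), we obtain

$$\sum_{m=1}^{M'} u(x_m)^2 = \sum_{m=1}^{M'} u_2(x_m)^2 \ge \lambda_{\min}(\mathbb{K}^{(M')}) \, \|u_2\|^2.$$

Now, recalling the inequality $2ab \ge -\frac{1}{\epsilon} a^2 - \epsilon b^2$, valid for any $\epsilon > 0$, and Lemma 2.2, we obtain:

$$
\begin{aligned}
J_{M'}(u) = J_{M'}(u_1 + u_2)
&\ge \|\Pi_{\mathcal{Z}_N^\perp} u_1\|^2 + \|\Pi_{\mathcal{Z}_N^\perp} u_2\|^2 + 2 (\Pi_{\mathcal{Z}_N^\perp} u_1, \Pi_{\mathcal{Z}_N^\perp} u_2) + \lambda_{\min}(\mathbb{K}^{(M')}) \|u_2\|^2 \\
&\ge (1 - \epsilon) \beta_{N,M'}^2 \|u_1\|^2 + \Big( 1 - \frac{1}{\epsilon} \Big) \|\Pi_{\mathcal{Z}_N^\perp} u_2\|^2 + \lambda_{\min}(\mathbb{K}^{(M')}) \|u_2\|^2.
\end{aligned}
$$

Let us consider $\epsilon \in \Big( \frac{1}{1 + \lambda_{\min}(\mathbb{K}^{(M')})}, 1 \Big)$. Recalling that $\|\Pi_{\mathcal{Z}_N^\perp} u_2\| \le \|u_2\|$, we obtain

$$
\begin{aligned}
J_{M'}(u)
&\ge (1 - \epsilon) \beta_{N,M'}^2 \|u_1\|^2 + \Big( \lambda_{\min}(\mathbb{K}^{(M')}) + 1 - \frac{1}{\epsilon} \Big) \|u_2\|^2 \\
&\ge \min \Big\{ \lambda_{\min}(\mathbb{K}^{(M')}) + 1 - \frac{1}{\epsilon}, \; (1 - \epsilon) \beta_{N,M'}^2 \Big\} \, \big( \|u_1\|^2 + \|u_2\|^2 \big),
\end{aligned}
$$

where $\|u_1\|^2 + \|u_2\|^2 = \|u\|^2$. Estimate (7) follows by considering $\epsilon = \frac{2}{2 + \lambda_{\min}(\mathbb{K}^{(M')})}$. □

We observe that $c_{N,M}$ is monotonically increasing with $M$; therefore, it is asymptotically bounded from below in the limit $M \to \infty$.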

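Proof. (Proposition 2.1) Applying Lemma 2.3, we find that the objective function $J_\xi : \mathcal{U} \to \mathbb{R}$ is strictly convex. Existence and uniqueness then follow from [10, Theorem 3, Chapter 8.2].

We now show the decomposition (4). If we define $\eta_\xi^\star = \Pi_{\mathcal{Z}_N^\perp} u_\xi^\star$ and $z_\xi^\star = \Pi_{\mathcal{Z}_N} u_\xi^\star$, this corresponds to showing that $\eta_\xi^\star \in \mathcal{U}_M$. Recalling that $V_M(u) = V_M(\Pi_{\mathcal{U}_M} u)$ for all $u \in \mathcal{U}$, we find

$$V_M(\eta_\xi^\star + z_\xi^\star) = V_M(\Pi_{\mathcal{U}_M} \eta_\xi^\star + z_\xi^\star).$$

Thus, since $u_\xi^\star$ is the unique minimizer, we must have $\Pi_{\mathcal{U}_M} \eta_\xi^\star = \eta_\xi^\star$. Thesis follows. □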

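Proposition 2.4. Suppose that $\mathcal{Z}_N \subset \mathcal{U}$ and $\beta_{N,M} > 0$. Then, $\eta_\xi^\star$ and $z_\xi^\star$ defined in (4) solve the following problem:

$$\min_{(\eta,z) \in \mathcal{U} \times \mathcal{Z}_N} \; J_\xi^{(2)}(\eta, z) := \xi \|\eta\|^2 + V_M(\eta + z). \tag{10}$$

Proof. Let $(\eta, z) \in \mathcal{U} \times \mathcal{Z}_N$. Then, we have

$$
\begin{aligned}
J_\xi^{(2)}(\eta, z)
&= \xi \left( \|\Pi_{\mathcal{Z}_N^\perp} \eta\|^2 + \|\Pi_{\mathcal{Z}_N} \eta\|^2 \right) + V_M(\Pi_{\mathcal{Z}_N} \eta + \Pi_{\mathcal{Z}_N^\perp} \eta + z) \\
&= \xi \|\Pi_{\mathcal{Z}_N} \eta\|^2 + J_\xi^{(2)}(\Pi_{\mathcal{Z}_N^\perp} \eta, \Pi_{\mathcal{Z}_N} \eta + z)
 \ge J_\xi^{(2)}(\Pi_{\mathcal{Z}_N^\perp} \eta, \Pi_{\mathcal{Z}_N} \eta + z).
\end{aligned}
$$

Therefore, we have that

$$\min_{(\eta,z) \in \mathcal{U} \times \mathcal{Z}_N} J_\xi^{(2)}(\eta, z) = \min_{(\eta,z) \in \mathcal{Z}_N^\perp \times \mathcal{Z}_N} J_\xi^{(2)}(\eta, z).$$

Thesis follows by observing that $J_\xi(u) = J_\xi^{(2)}(\Pi_{\mathcal{Z}_N^\perp} u, \Pi_{\mathcal{Z}_N} u)$. □

We observe that the two-field functional $J_\xi^{(2)}(\eta, z)$ is well-defined for $\mathcal{Z}_N \subset C(\Omega)$. We thus expect that it is possible to derive sufficient conditions for the well-posedness of (10) without assuming that $\mathcal{Z}_N \subset \mathcal{U}$. We address this issue in section 2.4.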

2.4.  Algebraic  formulation 

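We now discuss how to solve problem (2) numerically. With this in mind, we first introduce the matrices $\mathbb{K} \in \mathbb{R}^{M,M}$, $\mathbb{Z} \in \mathbb{R}^{N,N}$, $\mathbb{L} \in \mathbb{R}^{M,N}$ such that

$$\mathbb{K}_{m,m'} = K(x_m, x_{m'}), \qquad \mathbb{Z}_{n,n'} = (\zeta_n, \zeta_{n'}), \qquad \mathbb{L}_{m,n} = (K_{x_m}, \zeta_n) = \zeta_n(x_m). \tag{11}$$

Proposition 2.5 provides the algebraic formulation of the APBDW statement (2).

Proposition 2.5. Assuming $\beta_{N,M} > 0$, the solution $u_\xi^\star$ to (2) is given by

$$u_\xi^\star(x) = \sum_{m=1}^{M} \eta_m^\star \, K_{x_m}(x) + \sum_{n=1}^{N} z_n^\star \, \zeta_n(x), \tag{12}$$

where the pair $(\boldsymbol{\eta}^\star, \mathbf{z}^\star) \in \mathbb{R}^M \times \mathbb{R}^N$ solves

$$
\begin{cases}
(\xi M \, \mathbb{I} + \mathbb{K}) \, \boldsymbol{\eta}^\star + \mathbb{L} \, \mathbf{z}^\star = \mathbf{y}, \\
\mathbb{L}^T \boldsymbol{\eta}^\star = \mathbf{0}.
\end{cases} \tag{13}
$$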

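In anticipation of the proof, we recall a standard result (see, e.g., [26, Section 1.3.5]).

Lemma 2.6. The inf-sup constant $\beta_{N,M}$ is the square root of the minimum eigenvalue of the following eigenproblem:

$$\mathbb{L}^T \mathbb{K}^{-1} \mathbb{L} \, \mathbf{z}_n = \nu_n \, \mathbb{Z} \, \mathbf{z}_n, \qquad n = 1, \ldots, N. \tag{14}$$

Proof. Since $\sup_{v \in \mathcal{U}_M} \frac{(z,v)}{\|v\|} = \|\Pi_{\mathcal{U}_M} z\|$, we obtain:

$$\beta_{N,M} = \inf_{z \in \mathcal{Z}_N} \sup_{v \in \mathcal{U}_M} \frac{(z,v)}{\|z\| \, \|v\|} = \inf_{z \in \mathcal{Z}_N} \frac{\|\Pi_{\mathcal{U}_M} z\|}{\|z\|}.$$

We observe that for any $z \in \mathcal{Z}_N$ the projection onto $\mathcal{U}_M$ can be written as $\Pi_{\mathcal{U}_M} z = \sum_{m=1}^{M} \eta_m^z K_{x_m}$, where the vector $\boldsymbol{\eta}^z$ satisfies $\boldsymbol{\eta}^z = \mathbb{K}^{-1} \mathbb{L} \mathbf{z}$. Therefore, we find

$$\beta_{N,M}^2 = \inf_{\mathbf{z} \in \mathbb{R}^N} \frac{\mathbf{z}^T \mathbb{L}^T \mathbb{K}^{-1} \mathbb{L} \, \mathbf{z}}{\mathbf{z}^T \mathbb{Z} \, \mathbf{z}}.$$

Introducing the Lagrangian multiplier $\nu \in \mathbb{R}$, we can write the optimality conditions as

$$
\begin{cases}
\mathbb{L}^T \mathbb{K}^{-1} \mathbb{L} \, \mathbf{z} - \nu \, \mathbb{Z} \, \mathbf{z} = \mathbf{0}, \\
\mathbf{z}^T \mathbb{Z} \, \mathbf{z} = 1.
\end{cases}
$$

Thesis follows. □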
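Lemma 2.6 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names `inf_sup_constant` and `gauss`, the Gaussian kernel, its length scale and the point sets are hypothetical choices that do not come from the paper. The background functions are taken to be kernel translates $\zeta_n = K_{c_n}$, so that $\mathbb{Z}$ and $\mathbb{L}$ can be assembled exactly from kernel evaluations.

```python
import numpy as np


def inf_sup_constant(K, L, Z):
    """Return sqrt of the smallest nu such that L^T K^{-1} L z = nu Z z, cf. (14)."""
    A = L.T @ np.linalg.solve(K, L)           # L^T K^{-1} L (symmetric)
    C = np.linalg.cholesky(Z)                 # Z = C C^T
    # Reduce to the standard symmetric eigenproblem C^{-1} A C^{-T} w = nu w.
    B = np.linalg.solve(C, np.linalg.solve(C, A).T)
    nu = np.linalg.eigvalsh(0.5 * (B + B.T))  # eigenvalues in ascending order
    return float(np.sqrt(max(nu[0], 0.0)))


def gauss(a, b, ell=0.2):
    """Gaussian kernel matrix between 1D point sets (an assumed kernel choice)."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * ell ** 2))


x = np.linspace(0.0, 1.0, 5)                  # observation points x_m
K = gauss(x, x)

# Background functions zeta_n = K(., c_n): Z_{n,n'} = K(c_n, c_n'), L_{m,n} = K(x_m, c_n).
c_on = x[[1, 3]]                              # centers among the observation points
beta_on = inf_sup_constant(K, gauss(x, c_on), gauss(c_on, c_on))

c_off = np.array([0.125, 0.625])              # centers away from the observation points
beta_off = inf_sup_constant(K, gauss(x, c_off), gauss(c_off, c_off))
```

When the centers coincide with observation points, $\mathcal{Z}_N \subset \mathcal{U}_M$ and the computed constant equals one up to round-off; otherwise it lies strictly between zero and one.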

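Proof. (Proposition 2.5) It is straightforward to observe that we can restate the minimization statement (2) as

$$\min_{(\boldsymbol{\eta}, \mathbf{z}) \in \mathbb{R}^M \times \mathbb{R}^N} \; \xi \, \boldsymbol{\eta}^T \mathbb{K} \, \boldsymbol{\eta} + \frac{1}{M} \|\mathbb{K} \boldsymbol{\eta} + \mathbb{L} \mathbf{z} - \mathbf{y}\|_2^2, \qquad \text{subject to } \mathbb{L}^T \boldsymbol{\eta} = \mathbf{0},$$

where the constraint imposes that $\eta \in \mathcal{Z}_N^\perp$. Then, by introducing the Lagrangian multiplier $\boldsymbol{\nu} \in \mathbb{R}^N$, we can derive the optimality system:

$$
\begin{cases}
\left( \xi \, \mathbb{K} + \frac{1}{M} \mathbb{K}^2 \right) \boldsymbol{\eta}^\star + \frac{1}{M} \mathbb{K} \mathbb{L} \, \mathbf{z}^\star + \mathbb{L} \boldsymbol{\nu} = \frac{1}{M} \mathbb{K} \mathbf{y}, \\
\mathbb{L}^T \mathbb{K} \, \boldsymbol{\eta}^\star + \mathbb{L}^T \mathbb{L} \, \mathbf{z}^\star = \mathbb{L}^T \mathbf{y}, \\
\mathbb{L}^T \boldsymbol{\eta}^\star = \mathbf{0}.
\end{cases} \tag{15}
$$

By pre-multiplying (15)$_1$ by $M \, \mathbb{L}^T \mathbb{K}^{-1}$ and using (15)$_3$, we obtain

$$\mathbb{L}^T \mathbb{K} \, \boldsymbol{\eta}^\star + \mathbb{L}^T \mathbb{L} \, \mathbf{z}^\star + M \, \mathbb{L}^T \mathbb{K}^{-1} \mathbb{L} \, \boldsymbol{\nu} = \mathbb{L}^T \mathbf{y}.$$

Since $\mathbb{L}^T \mathbb{K}^{-1} \mathbb{L}$ is full-rank (it follows from the assumption $\beta_{N,M} > 0$ and Lemma 2.6), by comparing with (15)$_2$, we must have $\boldsymbol{\nu} = \mathbf{0}$. As a result, we can restate (15) as

$$
\begin{cases}
(\xi M \, \mathbb{I} + \mathbb{K}) \, \boldsymbol{\eta}^\star + \mathbb{L} \, \mathbf{z}^\star = \mathbf{y}, \\
\mathbb{L}^T \mathbb{K} \, \boldsymbol{\eta}^\star + \mathbb{L}^T \mathbb{L} \, \mathbf{z}^\star = \mathbb{L}^T \mathbf{y}, \\
\mathbb{L}^T \boldsymbol{\eta}^\star = \mathbf{0}.
\end{cases} \tag{16}
$$

Finally, we observe that (16)$_2$ follows from (16)$_1$ and (16)$_3$. We have indeed

$$\mathbb{L}^T \left( \mathbb{K} \boldsymbol{\eta}^\star + \mathbb{L} \mathbf{z}^\star - \mathbf{y} \right) = \mathbb{L}^T \left( (\mathbb{K} + \xi M \, \mathbb{I}) \boldsymbol{\eta}^\star + \mathbb{L} \mathbf{z}^\star - \mathbf{y} \right) = \mathbf{0}.$$

Thesis follows. □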
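The saddle-point system of Proposition 2.5 is small and dense, so it can be assembled and solved directly. The following is a minimal sketch, not the authors' implementation: the function name `solve_apbdw`, the Gaussian kernel, the affine background $\{1, x\}$ and the noise-free data are all assumptions made for illustration.

```python
import numpy as np


def solve_apbdw(K, L, y, xi):
    """Solve (xi*M*I + K) eta + L z = y,  L^T eta = 0  for (eta, z)."""
    M, N = L.shape
    if np.linalg.matrix_rank(L) < N:
        raise ValueError("L must have rank N for a unique solution")
    # Assemble the (M+N) x (M+N) saddle-point matrix.
    A = np.block([[xi * M * np.eye(M) + K, L],
                  [L.T, np.zeros((N, N))]])
    rhs = np.concatenate([y, np.zeros(N)])
    sol = np.linalg.solve(A, rhs)
    return sol[:M], sol[M:]


# Toy setup (every choice below is an illustrative assumption).
x = np.linspace(0.0, 1.0, 8)                                      # observation points x_m
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 0.3 ** 2))    # Gaussian kernel matrix
L = np.column_stack([np.ones_like(x), x])                         # zeta_1 = 1, zeta_2 = x
z_true = np.array([1.0, -2.0])
y = L @ z_true                                                    # noise-free data lying in Z_N

eta, z = solve_apbdw(K, L, y, xi=1e-2)
```

Because the data in this toy case are reproduced exactly by the background, the update coefficients vanish and the background coefficients are recovered, which is a quick consistency check on the assembled system.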

We now state an important corollary. The proof is straightforward and is omitted here.

Corollary 2.7. Problem (2) admits a unique solution of the form (12) if and only if the matrix $\mathbb{L}$ has rank equal to $N$.

Some comments are in order. First, Corollary 2.7 corresponds to the original result proved by Kimeldorf and Wahba ([16, Theorem 5.1]). Second, Corollary 2.7 provides an actionable condition to verify the well-posedness of Problem (2). We further observe that this condition does not rely on the assumption that the background belongs to $\mathcal{U}$. This observation motivates the next result.

Proposition 2.8. Suppose that $\mathcal{Z}_N \subset C(\Omega)$ and let ${\rm rank}(\mathbb{L}) = N$. Then, the solution to (10) exists, is unique and is of the form (12) with $(\boldsymbol{\eta}^\star, \mathbf{z}^\star) \in \mathbb{R}^M \times \mathbb{R}^N$ satisfying (13).
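Proof. We denote by $\eta_\xi^\star$, $z_\xi^\star$ the solution to (10). We first observe that

$$V_M(\eta_\xi^\star + z_\xi^\star) = V_M(\Pi_{\mathcal{U}_M} \eta_\xi^\star + z_\xi^\star).$$

Therefore, any solution to (10) is of the form (12). Substituting in the minimization statement, we obtain the following algebraic minimization problem for $(\boldsymbol{\eta}^\star, \mathbf{z}^\star) \in \mathbb{R}^M \times \mathbb{R}^N$:

$$\min_{(\boldsymbol{\eta}, \mathbf{z}) \in \mathbb{R}^M \times \mathbb{R}^N} \; \xi \, \boldsymbol{\eta}^T \mathbb{K} \, \boldsymbol{\eta} + \frac{1}{M} \|\mathbb{K} \boldsymbol{\eta} + \mathbb{L} \mathbf{z} - \mathbf{y}\|_2^2.$$

By deriving the stationary conditions, we find

$$
\begin{cases}
\left( \xi \, \mathbb{K} + \frac{1}{M} \mathbb{K}^2 \right) \boldsymbol{\eta}^\star + \frac{1}{M} \mathbb{K} \mathbb{L} \, \mathbf{z}^\star = \frac{1}{M} \mathbb{K} \mathbf{y}, \\
\mathbb{L}^T \mathbb{K} \, \boldsymbol{\eta}^\star + \mathbb{L}^T \mathbb{L} \, \mathbf{z}^\star = \mathbb{L}^T \mathbf{y}.
\end{cases} \tag{17}
$$

By premultiplying (17)$_1$ by $M \, \mathbb{K}^{-1}$, we find

$$(\xi M \, \mathbb{I} + \mathbb{K}) \, \boldsymbol{\eta}^\star + \mathbb{L} \, \mathbf{z}^\star = \mathbf{y}. \tag{18a}$$

If we now premultiply the latter equation by $\mathbb{L}^T$ and we subtract (17)$_2$, we obtain

$$\mathbb{L}^T \boldsymbol{\eta}^\star = \mathbf{0}. \tag{18b}$$

Saddle system (18a)-(18b) is well-posed since $\mathbb{K}$ is invertible and $\mathbb{L}$ is full-rank by hypothesis. Thesis follows. □
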
2.5.  Construction  of  the  spaces 

2.5.1.  Background  spaces  Zn 

In many engineering applications, mathematical models (which in mechanics typically consist of partial differential equations, PDEs) have been developed to describe the behavior of the physical state; these models typically depend on a set of parameters $\mu \in \mathbb{R}^P$ representing material properties, geometry, operating conditions, etc. In practical applications, such parameters are uncertain and thus only a confidence region $\mathcal{P} \subset \mathbb{R}^P$ is available. As a result, provided that the model is reasonably accurate, the true field $u^{\rm true}$ is close (in a suitable functional norm) to the solution $u^{\rm bk}(\mu^\star)$ to the parameterized best-knowledge model for some unknown parameter $\mu^\star \in \mathcal{P}$.

Following the idea proposed in [24], in this work, we define the background space $\mathcal{Z}_N$ such that for each parameter $\mu \in \mathcal{P}$ the solution to the PDE model $u^{\rm bk}(\mu)$ can be accurately represented by an element of the space. As a result, assuming that $u^{\rm true} \approx u^{\rm bk}(\mu^\star)$ for some $\mu^\star \in \mathcal{P}$, there exists an unknown element $z_N \in \mathcal{Z}_N$ such that $u^{\rm true} \approx z_N$.

The problem of determining a low-dimensional space $\mathcal{Z}_N$ that accurately represents the PDE solution manifold $\{u^{\rm bk}(\mu) : \mu \in \mathcal{P}\}$ has been extensively studied in the last few decades in the Model Order Reduction literature. The choice of the particular algorithm to build $\mathcal{Z}_N$ depends on the structure of the equation and on the parameterization. In this work, following [24], we employ the weak-Greedy algorithm ([28, Section 7.2.2]). The method properly selects $N$ parameters $\{\mu^n\}_{n=1}^N \subset \mathcal{P}$ and sets $\mathcal{Z}_N = {\rm span}\{u^{\rm bk}(\mu^n)\}_{n=1}^N$.

Some comments are in order. In our framework, the mathematical model is only used to build the background space $\mathcal{Z}_N$ and is not directly employed during the data assimilation procedure. In addition, the PBDW statement does not depend on how the space $\mathcal{Z}_N$ is built. These two observations provide us with some flexibility in choosing the background space. We could for instance first define the space $\mathcal{Z}_N$ using a parameterized best-knowledge model, and then augment the space exploiting historical data. The latter observation could be attractive for applications in which reliable historical data are available. In this work, we do not pursue this feature of our approach; we refer to [25] for a detailed discussion about this issue.

2.5.2. Choice of the Reproducing Kernel


In  approximation  theory,  several  choices  of  the  kernel  function  K  have  been  proposed  and  analyzed  both 
from  the  theoretical  and  the  numerical  perspective.  We  refer  to  [34]  and  to  the  references  therein  for  a  detailed 
discussion.  Here,  we  only  introduce  the  class  of  kernels  employed  in  the  numerical  simulations. 

In this work, we consider compactly supported radial basis functions of minimal degree (csRBFs, [34]), also known as Wendland functions. This class of kernels is defined as $K_\chi(x,y) = \phi_{d,k}(\gamma \|x - y\|_2)$, where $\chi = [k, \gamma]$ and

$$
\phi_{d,k}(r) =
\begin{cases}
p_{d,k}(r) & 0 \le r \le 1; \\
0 & r > 1.
\end{cases} \tag{19a}
$$

The polynomial $p_{d,k}$ has the following form for $k = 0, 1$ and for all $d$:

$$
p_{d,k}(r) =
\begin{cases}
(1 - r)^{\ell_{d,k}} & k = 0, \\
(1 - r)^{\ell_{d,k} + 1} \left( (\ell_{d,k} + 1) \, r + 1 \right) & k = 1,
\end{cases} \tag{19b}
$$

and $\ell_{d,k} = \lfloor d/2 \rfloor + k + 1$. We observe that it is possible to generalize (19b) to the more general case $k \in \mathbb{N}$; we refer to [34, Table 9.1] for the explicit formulas.
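Formulas (19a)-(19b) translate directly into code. The sketch below covers only the two cases $k = 0, 1$ stated above; the function names `wendland_ell`, `wendland_phi` and `csrbf_kernel` are hypothetical and introduced here for illustration.

```python
import math


def wendland_ell(d: int, k: int) -> int:
    """Exponent l_{d,k} = floor(d/2) + k + 1 used in (19b)."""
    return d // 2 + k + 1


def wendland_phi(r: float, d: int, k: int) -> float:
    """Evaluate phi_{d,k}(r) from (19a)-(19b) for r >= 0 and k in {0, 1}."""
    if r < 0.0:
        raise ValueError("r must be non-negative")
    if r > 1.0:
        return 0.0                      # compact support: phi vanishes for r > 1
    ell = wendland_ell(d, k)
    if k == 0:
        return (1.0 - r) ** ell
    if k == 1:
        return (1.0 - r) ** (ell + 1) * ((ell + 1) * r + 1.0)
    raise NotImplementedError("only k = 0 and k = 1 are covered by (19b)")


def csrbf_kernel(x, y, d: int, k: int, gamma: float) -> float:
    """K_chi(x, y) = phi_{d,k}(gamma * ||x - y||_2) with chi = [k, gamma]."""
    dist = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
    return wendland_phi(gamma * dist, d, k)
```

For example, with $d = 2$ and $k = 1$ the exponent is $\ell_{d,k} = 3$, so $\phi_{2,1}(r) = (1-r)^4 (4r + 1)$ on $[0,1]$; the kernel vanishes whenever $\|x - y\|_2 > 1/\gamma$, which makes the matrix $\mathbb{K}$ sparse for large $\gamma$.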

The next result clarifies the connection between csRBFs and Sobolev spaces. We refer to [34, Theorem 10.35] for the proof.

Proposition 2.9. Let us consider the compactly supported RBF $K_\chi$, $K_\chi(x,y) = \phi_{d,k}(\gamma \|x - y\|_2)$, introduced in (19). Let $\Omega = \mathbb{R}^d$ and let either one of these conditions hold:

(1) $d \ge 3$, $k \ge 0$;

(2) $d \ge 1$, $k > 0$.

Then, the native space for $K_\chi$ is the Sobolev space $H^{(d+1)/2+k}(\mathbb{R}^d)$.

Some comments are in order. By restricting ourselves to csRBF kernels, the choice of the inner product reduces to the choice of the parameters $\chi = [k, \gamma]$. We refer to section 4 for a thorough discussion about their choice. We further observe that, by resorting to these kernels, we lose the possibility of imposing strong (Dirichlet) boundary conditions at the boundary of $\Omega$ in the variational formulation.


2.6.  Connection  with  other  formulations 

2.6.1.  Connections  with  deterministic  formulations 

For suitable choices of the hyper-parameters, our formulation reduces to the Generalized Empirical Interpolation Method (GEIM, [23]), to the original PBDW formulation presented in [24] and to least-squares regression (LSR). If we choose $N = M$, then $u_\xi^\star = z_\xi^\star$ corresponds to the solution to the GEIM formulation for pointwise measurements. To prove the other two equivalences, we first introduce the formulations:

(1) (PBDW) find $u^\star = \eta^\star + z^\star$ such that

$$(\eta^\star, z^\star) := \arg\min_{(\eta,z) \in \mathcal{U} \times \mathcal{Z}_N} \|\eta\| \quad \text{subject to } \eta(x_m) + z(x_m) = y_m, \quad m = 1, \ldots, M; \tag{20}$$

(2) (LSR) find $z_{\rm LS}^\star \in \mathcal{Z}_N$ such that

$$z_{\rm LS}^\star := \arg\min_{z \in \mathcal{Z}_N} V_M(z). \tag{21}$$

When $\mathcal{Z}_N$ is built using a Proper Orthogonal Decomposition (POD, [19]), statement (21) corresponds to Gappy-POD ([11,35]).
We  now  show  the  equivalence. 
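Proposition 2.10. Let $\beta_{N,M} > 0$. Let $u_\xi^\star = \eta_\xi^\star + z_\xi^\star$ be the solution to (2). Then, we have

$$\lim_{\xi \to 0^+} \|u^\star - u_\xi^\star\| = 0, \tag{22}$$

and

$$\lim_{\xi \to \infty} \|z_{\rm LS}^\star - u_\xi^\star\| = 0. \tag{23}$$

Proof. We first derive algebraic formulations for the solutions to (20) and (21). Recalling [24, Section 2.4], we have that $u^\star$ is of the form (12) with coefficients $\boldsymbol{\eta}_0^\star$, $\mathbf{z}_0^\star$ satisfying

$$
\begin{bmatrix} \mathbb{K} & \mathbb{L} \\ \mathbb{L}^T & 0 \end{bmatrix}
\begin{bmatrix} \boldsymbol{\eta}_0^\star \\ \mathbf{z}_0^\star \end{bmatrix}
=
\begin{bmatrix} \mathbf{y} \\ \mathbf{0} \end{bmatrix},
$$

and thus

$$\mathbf{z}_0^\star = \left( \mathbb{L}^T \mathbb{K}^{-1} \mathbb{L} \right)^{-1} \mathbb{L}^T \mathbb{K}^{-1} \mathbf{y}; \qquad \boldsymbol{\eta}_0^\star = \mathbb{K}^{-1} \left( \mathbf{y} - \mathbb{L} \mathbf{z}_0^\star \right).$$

Similarly, the solution to (21) can also be written in the form (12) with

$$\mathbf{z}_\infty^\star = \left( \mathbb{L}^T \mathbb{L} \right)^{-1} \mathbb{L}^T \mathbf{y}; \qquad \boldsymbol{\eta}_\infty^\star = \mathbf{0}.$$

Exploiting Proposition 2.5, if we denote for convenience by $\boldsymbol{\eta}_\xi^\star$ and $\mathbf{z}_\xi^\star$ the vectors of coefficients associated with $u_\xi^\star$, we need to show that

$$\lim_{\xi \to 0^+} (\boldsymbol{\eta}_\xi^\star, \mathbf{z}_\xi^\star) = (\boldsymbol{\eta}_0^\star, \mathbf{z}_0^\star), \qquad \lim_{\xi \to \infty} (\boldsymbol{\eta}_\xi^\star, \mathbf{z}_\xi^\star) = (\boldsymbol{\eta}_\infty^\star, \mathbf{z}_\infty^\star).$$

The proofs of these two limits are straightforward and are omitted here. □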
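The two limits in Proposition 2.10 can be checked numerically on a small example. This is a hedged sketch, not part of the original analysis: the helper `apbdw_coefficients`, the Gaussian kernel, the affine background and the synthetic data are assumptions. The system of Proposition 2.5 is solved here by eliminating $\boldsymbol{\eta}$ (a Schur complement on $\mathbf{z}$), which stays well scaled for very large and very small $\xi$.

```python
import numpy as np


def apbdw_coefficients(K, L, y, xi):
    """Solve (xi*M*I + K) eta + L z = y, L^T eta = 0 by eliminating eta."""
    M = K.shape[0]
    A = xi * M * np.eye(M) + K
    Ainv_L = np.linalg.solve(A, L)
    Ainv_y = np.linalg.solve(A, y)
    # L^T eta = 0 with eta = A^{-1}(y - L z) gives (L^T A^{-1} L) z = L^T A^{-1} y.
    z = np.linalg.solve(L.T @ Ainv_L, L.T @ Ainv_y)
    eta = Ainv_y - Ainv_L @ z
    return eta, z


# Toy data (illustrative assumptions only).
x = np.linspace(0.0, 1.0, 6)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 0.2 ** 2))   # Gaussian kernel matrix
L = np.column_stack([np.ones_like(x), x])                        # zeta_1 = 1, zeta_2 = x
y = 1.0 + x + 0.5 * np.sin(2.0 * np.pi * x)

# PBDW limit (xi -> 0+): coefficients of the interpolation problem (20).
Kinv_L = np.linalg.solve(K, L)
z0 = np.linalg.solve(L.T @ Kinv_L, Kinv_L.T @ y)
eta0 = np.linalg.solve(K, y - L @ z0)

# LSR limit (xi -> infinity): ordinary least squares on the background, eta = 0.
z_ls = np.linalg.lstsq(L, y, rcond=None)[0]

eta_small, z_small = apbdw_coefficients(K, L, y, xi=1e-12)
eta_large, z_large = apbdw_coefficients(K, L, y, xi=1e8)
```

In this toy setting the coefficients for $\xi = 10^{-12}$ agree with the interpolatory PBDW solution, while those for $\xi = 10^{8}$ agree with the least-squares fit and a vanishing update.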

2.6.2.  Connection  with  3D-VAR 

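To provide insights about the formulation and its intimate connection with 3D-VAR, we derive APBDW starting from 3D-VAR exploiting a probabilistic interpretation. With this in mind, we define the 3D-VAR statement in a variational fashion ([4, Chapter 2]): find $u^\star \in \mathcal{U}$ such that

$$u^\star := \arg\min_{u \in \mathcal{U}} \; \frac{1}{2} \|u - u^{\rm bk}\|^2 + \frac{1}{2} \|\mathcal{L}_M(u) - \mathbf{y}\|_W^2,$$

where $u^{\rm bk}$ is the non-parametric background, $\mathcal{L}_M(u) = (u(x_1), \ldots, u(x_M))$, $W$ is the observation error covariance, and $\|\mathbf{d}\|_W = \sqrt{\mathbf{d}^T W \mathbf{d}}$. We observe that if we substitute $u^{\rm bk}$ with the parameterized background $u^{\rm bk}(\mu)$, $\mu \in \mathcal{P}$, we obtain the so-called partial spline model ([33, Chapter 9]): find $(\mu^\star, u^\star) \in \mathcal{P} \times \mathcal{U}$ such that

$$(\mu^\star, u^\star) = \arg\min_{(\mu,u) \in \mathcal{P} \times \mathcal{U}} \; \frac{1}{2} \|u - u^{\rm bk}(\mu)\|^2 + \frac{1}{2} \|\mathcal{L}_M(u) - \mathbf{y}\|_W^2;$$

or, equivalently,

$$(\mu^\star, \eta^\star) = \arg\min_{(\mu,\eta) \in \mathcal{P} \times \mathcal{U}} \; \frac{1}{2} \|\eta\|^2 + \frac{1}{2} \|\mathcal{L}_M(u^{\rm bk}(\mu) + \eta) - \mathbf{y}\|_W^2, \tag{24}$$

with $u^\star = u^{\rm bk}(\mu^\star) + \eta^\star$. If we introduce the rank-$N$ approximation ([7]) of the best-knowledge field $u^{\rm bk}(\mu)$,

$$u_N^{\rm bk}(x; \mu) = \sum_{n=1}^{N} \phi_n(\mu) \, \zeta_n(x), \qquad x \in \Omega,$$

we can approximate statement (24) as follows:

$$(\mu^\star, \eta^\star) = \arg\min_{(\mu,\eta) \in \mathcal{P} \times \mathcal{U}} \; \frac{1}{2} \|\eta\|^2 + \frac{1}{2} \Big\| \mathcal{L}_M \Big( \sum_{n=1}^{N} \phi_n(\mu) \, \zeta_n + \eta \Big) - \mathbf{y} \Big\|_W^2. \tag{25}$$

We can now relax problem (25) as follows:

$$(\boldsymbol{\phi}^\star, \eta^\star) = \arg\min_{(\boldsymbol{\phi},\eta) \in \mathbb{R}^N \times \mathcal{U}} \; \frac{1}{2} \|\eta\|^2 + \frac{1}{2} \Big\| \mathcal{L}_M \Big( \sum_{n=1}^{N} \phi_n \, \zeta_n + \eta \Big) - \mathbf{y} \Big\|_W^2,$$

which can also be written as

$$(z^\star, \eta^\star) = \arg\min_{(z,\eta) \in \mathcal{Z}_N \times \mathcal{U}} \; \frac{1}{2} \|\eta\|^2 + \frac{1}{2} \|\mathcal{L}_M(z + \eta) - \mathbf{y}\|_W^2,$$

where $\mathcal{Z}_N := {\rm span}\{\zeta_n\}_{n=1}^N$. The latter corresponds to the two-field APBDW formulation (10) for $W = \frac{1}{\xi M} \mathbb{I}$.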
We provide some remarks. First, our derivation allows us to interpret APBDW as a convex relaxation of the partial spline model for a parametric affine background. Instead of penalizing the difference between the state estimate and the manifold $\mathcal{M}^{\rm bk} = \{u^{\rm bk}(\mu) : \mu \in \mathcal{P}\}$, we penalize the distance from the linear space $\mathcal{Z}_N$. This derivation motivates the interpretation of $z_\xi^\star$ as the deduced background estimate, the component of the state informed by the prior knowledge of the system, and the interpretation of $\eta_\xi^\star$ as the update. Second, this derivation allows us to interpret APBDW as a Gaussian linear system with an improper prior. This interpretation was first proposed by Wahba in [32]. In this work, we do not exploit this interpretation to develop credible intervals for the state estimation error.

We now comment on the relaxation process. If we denote by $\bar{\mu} \in \mathcal{P}$ the centroid of the parameter space and we normalize the directions $\zeta_1, \ldots, \zeta_N$, we typically observe that

$$\max_{\mu \in \mathcal{P}} |\phi_1(\mu) - \phi_1(\bar{\mu})| \gg \max_{\mu \in \mathcal{P}} |\phi_2(\mu) - \phi_2(\bar{\mu})| \gg \ldots$$

The relaxation step that leads to APBDW discards this information associated with the best-knowledge map. Despite the gain in computational efficiency, this relaxation might lead to a deterioration in performance. Extension of APBDW in this direction is the subject of ongoing research. In this respect, we mention the work of Binev et al. ([5]), who proposed the multi-space problem to take into account the anisotropy in the coefficients of the expansion.


3.  Error  analysis 

We present a priori and a posteriori estimates for the $L^2(\Omega)$ state-estimation error $\|u^{\rm true} - u_\xi^*\|_{L^2(\Omega)}$. The
importance of the error analysis is twofold. First, it motivates our formulation from a theoretical viewpoint.
Second, it provides insights about the role of the different pieces of our formulation: the regularization parameter
$\xi$, the background space $Z_N$, the kernel $K$ and the centers $X_M$.

3.1.  A  priori  error  analysis 

In order to derive error bounds for the $L^2(\Omega)$ state-estimation error $\|u^{\rm true} - u_\xi^*\|_{L^2(\Omega)}$, we must first introduce
assumptions on our dataset $\mathcal{D}_M$. To our knowledge, three different scenarios have been considered so far.

(1) Random-design regression: the pairs $\{(x_m, y_m)\}_{m=1}^M$ are drawn independently from an unknown joint
distribution $\rho_{(X,Y)}$. In this case, the objective of learning is to estimate the conditional expectation
$E[Y | X = x]$.

(2) Fixed-design regression: the centers $X_M = \{x_1, \ldots, x_M\}$ are fixed (non-random) points in $\Omega$, while the
responses $\mathcal{Y}_M = \{y_m\}_{m=1}^M$ satisfy $y_m = u^{\rm true}(x_m) + \epsilon_m$, where $u^{\rm true}: \Omega \to \mathbb{R}$ is the deterministic field of
interest and $\epsilon_1, \ldots, \epsilon_M$ are independent identically distributed (i.i.d.) random variables with zero mean
and variance $\sigma^2$, $\epsilon_m \sim (0, \sigma^2)$.

(3) Scattered data approximation: both the centers $X_M$ and the responses $\mathcal{Y}_M$ are non-random, and we assume that
there exists some unknown $\delta > 0$ such that $|y_m - u^{\rm true}(x_m)| \leq \delta$ for all $m = 1, \ldots, M$.

The first scenario has been extensively studied in the statistical learning literature (see, e.g., [9,30]). We
refer to [13] for a complete review of the available error bounds. The second scenario has also been studied
in statistics; we refer to the survey [12] for further details about a specific class of kernels. Finally, the third
scenario has been studied in approximation theory and radial basis functions (see, e.g., [34]). From the modeling
perspective, the first scenario refers to the case in which we do not have control over the observation centers, the
second scenario addresses the problem of random error in the measurements, and the third scenario addresses
the problem of systematic deterministic error.
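The difference between the second and third scenarios can be made concrete with a small synthetic sketch. The field `u_true`, the number of centers and the values of `sigma` and `delta` below are hypothetical choices made only for illustration; Gaussian noise is likewise an assumption, since the fixed-design scenario only prescribes zero mean and variance $\sigma^2$.

```python
import numpy as np

def fixed_design_responses(u_true, centers, sigma, rng):
    """Scenario (2): fixed centers, i.i.d. zero-mean noise with variance sigma^2.
    Gaussian noise is an illustrative choice; the scenario only fixes mean and variance."""
    return u_true(centers) + sigma * rng.standard_normal(centers.shape[0])

def scattered_data_responses(u_true, centers, delta, rng):
    """Scenario (3): disturbances bounded by delta in absolute value.
    They are drawn once from a seeded generator purely to obtain concrete numbers."""
    return u_true(centers) + delta * rng.uniform(-1.0, 1.0, centers.shape[0])

rng = np.random.default_rng(0)
u_true = lambda x: np.sin(2.0 * np.pi * x)   # hypothetical true field on (0, 1)
centers = np.linspace(0.0, 1.0, 16)          # fixed, non-random observation centers
sigma, delta = 0.1, 0.05                     # hypothetical noise levels

y_fixed = fixed_design_responses(u_true, centers, sigma, rng)
y_scattered = scattered_data_responses(u_true, centers, delta, rng)

# In scenario (3) every disturbance stays within the bound delta.
max_disturbance = np.max(np.abs(y_scattered - u_true(centers)))
```

In the fixed-design case individual errors may exceed any fixed bound, whereas in the scattered-data case only the bound $\delta$ is known and no statistical structure is assumed.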

In the next two sections, we present error bounds for both the second and the third scenarios. Our analysis
for the third scenario is largely inspired by the work of Krebs, Louis and Wendland [18]. On the other hand,
the analysis for fixed-design regression seems to be new. We state upfront that in the remainder of this section we
assume that $Z_N \subset U$.

3.1.1.  An  a  priori  error  bound  for  scattered  data  approximation 
We  state  the  main  result  of  this  section. 


Proposition 3.1. Let $\Omega$ be a Lipschitz domain and let $U$ be the Sobolev space $H^\tau(\Omega)$ with $\tau > d/2$. Let $\beta_{N,M}$
in (3) be strictly positive.

Then, if $u^{\rm true} \in U$, the following holds:


\utrue_u*  ||2 


Ill!2(fi)  <  cn,Xm  (2||n*xut™||  +  +  hxM m  (5  +  ) 


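where $C_{N,X_M}$ is defined as

$$C_{N,X_M} := \sup_{u \in U} \frac{\|u\|^2_{L^2(\Omega)}}{h_{X_M}^{2\tau} \|\Pi_{Z_N^\perp} u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)}},$$

$\|u\|_{\ell^2(X_M)} = \sqrt{\sum_{m=1}^M u(x_m)^2}$, and the fill distance $h_{X_M}$ as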


(26a) 


(26b) 


$$h_{X_M} = \sup_{x \in \Omega} \min_{m = 1, \ldots, M} \|x - x_m\|_2.$$


(26c) 
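As an illustration of the fill distance in (26c), the following sketch approximates it for equispaced centers on the unit square. It is only an approximation under stated assumptions: the supremum over $\Omega$ is replaced by a maximum over a finite set of sample points, and the helper names `fill_distance` and `tensor_grid` are hypothetical.

```python
import numpy as np

def fill_distance(centers, domain_samples):
    """Approximate h = sup_x min_m ||x - x_m||_2 by replacing the sup over
    the domain with a max over a finite set of sample points."""
    diffs = domain_samples[:, None, :] - centers[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)        # shape (n_samples, M)
    return dists.min(axis=1).max()

def tensor_grid(n):
    """n x n equispaced points on the closed unit square."""
    t = np.linspace(0.0, 1.0, n)
    g1, g2 = np.meshgrid(t, t, indexing="ij")
    return np.column_stack([g1.ravel(), g2.ravel()])

samples = tensor_grid(81)                        # dense sampling used in place of the sup
h_25 = fill_distance(tensor_grid(5), samples)    # M = 25 centers, spacing 0.25
h_81 = fill_distance(tensor_grid(9), samples)    # M = 81 centers, spacing 0.125
```

For an equispaced grid with spacing $s$ in two dimensions, the farthest points are the cell centers, at distance $s\sqrt{2}/2$ from the nearest center, so halving the spacing halves the fill distance.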


The proof of Proposition 3.1 is technical; for this reason, we present it in Appendix A. In the remainder of
this section, we state a number of remarks.

Remark 3.2. We observe that the constant $C_{N,X_M}$ is associated with the maximum eigenvalue of
a generalized eigenproblem. Provided that the inf-sup constant $\beta_{N,M} > 0$ and $h_{X_M} < 1$, the constant $C_{N,X_M}$
defined in (26b) can be estimated as follows:


(27) 


where $C_{N,M}$ is defined in (6). We rigorously prove (27) in Appendix A. We remark that, to estimate
$C_{N,X_M}$ in practice, we need to numerically approximate the maximum eigenvalue of the associated generalized
eigenproblem. We refer to [14] for a discussion of the use of meshless methods based on csRBF for the solution
of eigenproblems.

Remark 3.3. For quasi-uniform grids, $h_{X_M} \sim M^{-1/d}$ for $M \to \infty$, the right-hand side reduces to


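$$\|u^{\rm true} - u_\xi^*\|^2_{L^2(\Omega)} \lesssim \|\Pi_{Z_N^\perp} u^{\rm true}\|^2 \, h_{X_M}^{2\tau} \left(1 + \frac{1}{\lambda}\right)^2 + \delta^2 (1 + \lambda)^2, \qquad (28a)$$

where

$$\lambda = \frac{\sqrt{\xi} \, \|\Pi_{Z_N^\perp} u^{\rm true}\|}{\delta}. \qquad (28b)$$

By minimizing with respect to $\lambda$, we obtain that the asymptotically optimal choice of $\xi$ is

$$\xi = \left( \frac{\delta}{\|\Pi_{Z_N^\perp} u^{\rm true}\|} \right)^{2/3} h_{X_M}^{4\tau/3}. \qquad (29a)$$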


For this choice of the hyper-parameter, we obtain:

$$\|u^{\rm true} - u_\xi^*\|^2_{L^2(\Omega)} \leq O\left( \|\Pi_{Z_N^\perp} u^{\rm true}\|^{2/3} \, h_{X_M}^{2\tau/3} \, \delta^{4/3} + \delta^2 \right), \quad M \to \infty. \qquad (29b)$$

We observe that for any finite $\delta > 0$, we do not expect convergence in an $L^2$ sense. We also observe that
the optimal value of $\xi$ increases with $\delta$, decreases as the background best-fit error
$\|\Pi_{Z_N^\perp} u^{\rm true}\|$ increases, and decreases as $M$ increases.

Remark  3.4.  In  the  case  of  perfect  measurements,  estimate  (26)  reduces  to 

\Wtrue  <|||2(o)  <  \cn,Xm  (16 h%M  +  hdXMMi)  \\0.z±utrue\\2 .  (30) 


If we neglect the factor $C_{N,X_M}$, we observe a multiplicative effect between $M$ convergence (associated with the
update) and $N$ convergence (associated with the deduced background).

3.1.2.  A  priori  error  bounds  for  fixed- design  regression 

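We first introduce some notation. First, we define the matrix $\mathbb{A}_\xi \in \mathbb{R}^{N+M, N+M}$,

$$\mathbb{A}_\xi := \begin{bmatrix} \xi M \mathbb{I}_M + \mathbb{K} & \mathbb{L} \\ \mathbb{L}^T & 0 \end{bmatrix}, \qquad (31a)$$

associated with the linear system (13). Then, we introduce $\Sigma \in \mathbb{R}^{N+M, N+M}$,

$$\Sigma := \begin{bmatrix} \mathbb{I}_M & 0 \\ 0 & 0 \end{bmatrix}. \qquad (31b)$$

Finally, we introduce $\mathbb{M} \in \mathbb{R}^{N+M, N+M}$ such that

$$\mathbb{M}_{i,i'} := \int_\Omega \psi_i(x) \psi_{i'}(x) \, dx, \qquad \psi_i = \begin{cases} K_{x_i} & i = 1, \ldots, M, \\ \zeta_{i-M} & i = M+1, \ldots, M+N. \end{cases} \qquad (31c)$$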


We  can  now  state  the  error  bound. 

Proposition 3.5. Let $\Omega$ be a Lipschitz domain and let $U$ be the Sobolev space $H^\tau(\Omega)$ with $\tau > d/2$. Let $\beta_{N,M}$
in (3) be strictly positive.

Then, if $u^{\rm true} \in U$, the following holds:


E 


\utrue_u*  ||2 


{lli2(n) 


<  \Cn,xm  (16 h%M  +  hdXMMf )  \\Hz,utrue\\2  +  2a2  trace  (A’1 


MA”'S 


(32) 


The proof of Proposition 3.5 is presented in Appendix A. We observe that, unlike in the previous case, it is not
evident how to provide explicit estimates for the optimal value of $\xi$. We also observe that (32) can be easily
extended to correlated noise.


3.2.  A  posteriori  error  analysis 

The next result provides the identity of interest.

Proposition 3.6. Let $\{x_i\}_{i=1}^I$ be drawn independently from a uniform distribution over $\Omega$. Let $y_i = u^{\rm true}(x_i) +
\delta_i + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_I$ are i.i.d. random variables such that $\epsilon_i \sim (0, \sigma^2)$ and $\delta_1, \ldots, \delta_I$ are deterministic unknown
disturbances. Let us further assume that $\{x_i\}_{i=1}^I$ and $\{\epsilon_i\}_{i=1}^I$ are independent random sequences.

Then, we have that the mean squared error

$$MSE_I := \frac{1}{I} \sum_{i=1}^I \left( y_i - u_\xi^*(x_i) \right)^2 \qquad (33)$$

satisfies

E [MSEj]  =  E2mean  +  +  j  Si  -  ^7  E  Si  (Jn  utrue(x)  ul(x)  dx^j 

where $E_{\rm mean}$ is defined as follows:


(34) 


$$E^2_{\rm mean} := \frac{1}{|\Omega|} \int_\Omega \left( u^{\rm true}(x) - u_\xi^*(x) \right)^2 dx. \qquad (35)$$

Proof. To simplify notation, we introduce the random sequence $\{e_i = u^{\rm true}(x_i) - u_\xi^*(x_i)\}_{i=1}^I$. We observe that
$e_1, \ldots, e_I$ are i.i.d. and $E[e_i^2] = \frac{1}{|\Omega|} \|u^{\rm true} - u_\xi^*\|^2_{L^2(\Omega)}$. Then, exploiting linearity of the expected value operator
and the fact that $\{x_i\}_{i=1}^I$ and $\{\epsilon_i\}_{i=1}^I$ are independent, we find

i  1  o  1 

E [MSE!]  =  E  [el]  +E  [e2]  +  jE  5i  ~  J  E  <^EN- 

1  i= 1  1  2=1 

The thesis follows. □


In the absence of systematic noise ($\delta_i = 0$), identity (34) reduces to

$$E[MSE_I] = E^2_{\rm mean} + \sigma^2. \qquad (36)$$


Identity (36) shows that for random noise ($\delta_i = 0$) the mean squared error (33) can be used to asymptotically
bound the squared $L^2(\Omega)$ error. Furthermore, since $\sigma^2$ is independent of the state estimate, minimizing the
mean squared error is equivalent to minimizing the $L^2(\Omega)$ error. The latter observation motivates the adaptive
strategy presented in section 4.
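The identity for purely random noise, $E[MSE_I] = E^2_{\rm mean} + \sigma^2$, can be checked numerically. The sketch below is a Monte Carlo illustration under assumptions that are not part of the paper: a one-dimensional domain $\Omega = (0, 1)$, a hypothetical true field, a hypothetical state estimate whose error is $-0.1x$, and Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(0)

u_true = lambda x: np.sin(2.0 * np.pi * x)            # hypothetical true field
u_est = lambda x: np.sin(2.0 * np.pi * x) + 0.1 * x   # hypothetical state estimate
sigma = 0.05                                          # noise standard deviation
n_val = 200_000                                       # large I to approximate the expectation

# Validation points drawn uniformly over Omega = (0, 1), noise independent of them.
x = rng.uniform(0.0, 1.0, n_val)
y = u_true(x) + sigma * rng.standard_normal(n_val)

# Empirical mean squared error over the validation set.
mse = np.mean((y - u_est(x)) ** 2)

# E_mean^2 = (1/|Omega|) * integral of (u_true - u_est)^2 = integral of (0.1 x)^2 on (0, 1).
e_mean_sq = 0.01 / 3.0
predicted = e_mean_sq + sigma ** 2
```

With these choices `mse` and `predicted` agree up to the sampling error of the Monte Carlo average; the offset between the measured error and the $L^2$ error is the constant $\sigma^2$, which does not depend on the estimate.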

4. Model adaptation

As observed in the previous sections, our procedure depends on a number of design choices, which include
the choice of several hyper-parameters and the choice of the observation centers and of the background space $Z_N$.
In section 4.1, we discuss how to exploit the error analysis to perform some design choices a priori. Then,
in section 4.2, we discuss the adaptive strategy used to tune the parameters of the formulation after having
acquired data.

4.1.  A  priori  considerations 

We first recall the APBDW formulation and we discuss separately each element of the formulation. The
APBDW state estimate $u_\xi^*$ is given by

$$u_\xi^* := \arg\min_{u \in U} \; \xi \|\Pi_{Z_N^\perp} u\|^2 + \frac{1}{M} \sum_{m=1}^M \left( u(x_m) - y_m \right)^2, \qquad (37)$$

where $Z_N = {\rm span}\{\zeta_n\}_{n=1}^N$. We observe that the formulation depends on the regularization parameter $\xi$, the
sensor locations $X_M = \{x_m\}_{m=1}^M$ and the choice of the RKHS norm $\|\cdot\|$. Recalling the Moore-Aronszajn theorem,
the latter choice is equivalent to the choice of the reproducing kernel $K$.

The hyper-parameter $\xi > 0$ controls the amount of regularization introduced: for $\xi = 0$, the solution to (37)
interpolates exactly the data, while for $\xi \to \infty$, the solution to (37) converges to the least-squares solution. Our
error analysis shows that the choice of $\xi$ strongly depends on the noise variance and on the maximum systematic
error $\delta$; both these quantities are hard to estimate a priori.

We now discuss the choice of the kernel $K$. Since in this work we employ csRBF kernels, this reduces to
the choice of the hyper-parameters $k$ and $\gamma$ in $K_\gamma(x, y) = \phi_{d,k}(\gamma \|x - y\|_2)$, where $\phi_{d,k}$ is defined in (19). As
stated in Proposition 2.9, the parameter $k$ determines the Sobolev regularity of the RKHS. Since in practical
applications we expect that measurements are noisy, recalling estimate (29), we choose the minimal value of
$k$ for which $K$ is a reproducing kernel, which is $k = 1$ for $d = 1, 2$ and $k = 0$ for $d \geq 3$. The parameter $\gamma$
regulates the length scale of the kernel functions. In our experience, for small values of $M$, the choice of $\gamma$
weakly influences the results; we can thus pick $\gamma$ a priori such that the kernel functions $K_{x_m}$ share the same
length scale as the elements of $Z_N$. On the other hand, for larger values of $M$, the choice of $\gamma$ significantly
influences the performance of the method, and it must be adapted using data. We remark that by changing $\gamma$
we effectively modify the inner product $(\cdot, \cdot)$.
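The role of $\gamma$ as an inverse length scale can be illustrated with a short sketch. Definition (19) is not reproduced in this section, so the code assumes that $\phi_{d,k}$ for $d = 2$, $k = 1$ is the standard Wendland function $(1 - r)_+^4 (4r + 1)$; the function names and the grid of centers are hypothetical.

```python
import numpy as np

def wendland_k1(r):
    """Wendland function (1 - r)_+^4 (4 r + 1), compactly supported on [0, 1].
    Assumed here to coincide with phi_{d,k} for d = 2, k = 1."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def kernel_matrix(X, Y, gamma):
    """Matrix with entries K_gamma(x_i, y_j) = phi(gamma * ||x_i - y_j||_2)."""
    dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    return wendland_k1(gamma * dists)

# 5 x 5 equispaced centers on the unit square (spacing 0.25).
t = np.linspace(0.0, 1.0, 5)
g1, g2 = np.meshgrid(t, t, indexing="ij")
X = np.column_stack([g1.ravel(), g2.ravel()])

K_wide = kernel_matrix(X, X, gamma=1.0)     # support radius 1 / gamma = 1
K_narrow = kernel_matrix(X, X, gamma=5.0)   # support radius 0.2, below the grid spacing

# Fraction of nonzero entries: a larger gamma gives a sparser kernel matrix.
fill_wide = np.count_nonzero(K_wide) / K_wide.size
fill_narrow = np.count_nonzero(K_narrow) / K_narrow.size
```

When the support radius $1/\gamma$ falls below the spacing between centers, the kernel functions no longer overlap and the kernel matrix reduces to the identity, which is one way to see why $\gamma$ has to be matched to the density of the sensors.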

We comment on the choice of the sensor locations. If we neglect the effect of the sensor locations on the
stability constant $C_{N,X_M}$, the error analysis suggests choosing the observation centers to minimize the fill
distance $h_{X_M}$ in (26c). For $N \simeq M$, the sensor locations might significantly influence the value of $C_{N,X_M}$. As
a result, since $C_{N,X_M}$ multiplies the error bound, it might be worth choosing the observation centers to minimize $C_{N,X_M}$ for any given $M$. For the
PBDW formulation, in [24], the authors propose a Greedy strategy for the selection of the observation centers
to maximize the inf-sup constant $\beta_{N,M}$ defined in (3). In this work, we simply consider equispaced observation
centers and we defer to future work more elaborate strategies for the selection of the observation centers
that address both stability and approximation.

Motivated by the previous considerations, in our numerical simulations, we choose adaptively the regularization
parameter $\xi$ and the kernel parameter $\gamma$. In the next section, we present the algorithm used to perform
online adaptation.

4.2.  Adaptive  procedure 

In the statistical learning literature, several approaches have been presented to tune the design parameters
of regularized regression formulations; we refer to [15, Chapter 7] and to [17] for a thorough overview. The
adaptive strategy depends on the size of the dataset, which in our context corresponds to the number of available
transducers. If we denote by $L$ the number of available transducers and by $\mathcal{D}_L = \{(x_\ell, y_\ell)\}_{\ell=1}^L$ the corresponding
dataset, for large values of $L$, the holdout method is the most widely used approach. On the other hand, for
small values of $L$, K-fold cross-validation is typically employed. In the remainder of this section, we briefly
review these techniques and we discuss their application to our problem.

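The holdout method partitions the dataset $\mathcal{D}_L$ into the two mutually exclusive subsets $\mathcal{D}_M = \{(x_m, y_m)\}_{m=1}^M$ (training set)
and $\mathcal{D}_I = \{(x_i, y_i)\}_{i=1}^I$ (validation set). Given the finite-dimensional search space $\mathcal{S}^{\rm hyper}$ for $(\xi, \gamma)$, we generate the state estimate
$u^*_{\xi,\gamma}$ based on the training set and then we compute the mean squared error over the validation set

$$MSE_I(\xi, \gamma) = \frac{1}{I} \sum_{i=1}^I \left( y_i - u^*_{\xi,\gamma}(x_i) \right)^2, \qquad (38)$$

for each $(\xi, \gamma)$ in $\mathcal{S}^{\rm hyper}$. Finally, we choose the state estimate associated with the choice of $(\xi, \gamma)$ that minimizes
$MSE_I(\xi, \gamma)$ over $\mathcal{S}^{\rm hyper}$.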

Recalling Proposition 3.6, if $\{x_i\}_i$ are drawn from a uniform distribution over $\Omega$ and the disturbances are
homoscedastic, this choice of the hyper-parameters asymptotically minimizes the $L^2$ state-estimation error.
This result holds independently of the strategy employed to compute the state estimate and thus independently
of the strategy employed to select the training observation centers. Motivated by this observation, in this work,
we choose a uniform grid of sensors for training and we choose the validation sensors by sampling uniformly
over $\Omega$. As discussed in [29], if $u_\xi^*$ is an accurate description of the true field $u^{\rm true}$, $MSE_I$ rapidly converges to
its expected value. Therefore, the number $I$ of measurements that should be reserved for validation is modest.
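The holdout procedure described above can be sketched as a grid search. This is a minimal illustration, not the implementation used in the paper. It assumes that the state estimate has the form $u = \sum_m a_m K_{x_m} + \sum_n z_n \zeta_n$ and that the coefficients solve a saddle-point system with blocks $\xi M I + K$, $L$, $L^T$, $0$ and right-hand side $(y, 0)$; the Wendland function, the two-dimensional background basis, the true field and the search grid are all hypothetical choices.

```python
import numpy as np

def wendland_k1(r):
    """Wendland function (1 - r)_+^4 (4 r + 1); assumed form of the csRBF for d = 2, k = 1."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def gram(X, Y, gamma):
    """Kernel matrix with entries phi(gamma * ||x_i - y_j||_2)."""
    return wendland_k1(gamma * np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2))

def apbdw_fit(X, y, background, xi, gamma):
    """Solve the assumed saddle-point system and return the state estimate as a callable.
    background(X) returns the (n_points, N) matrix of background basis values."""
    M = len(y)
    K, L = gram(X, X, gamma), background(X)
    N = L.shape[1]
    A = np.block([[xi * M * np.eye(M) + K, L], [L.T, np.zeros((N, N))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(N)]))
    a, z = sol[:M], sol[M:]          # update coefficients, background coefficients
    return lambda Xnew: gram(Xnew, X, gamma) @ a + background(Xnew) @ z

def holdout_select(X_tr, y_tr, X_val, y_val, background, xis, gammas):
    """Return (mse, xi, gamma, estimate) minimizing the validation MSE over the grid."""
    best = None
    for xi in xis:
        for gamma in gammas:
            u = apbdw_fit(X_tr, y_tr, background, xi, gamma)
            mse = np.mean((y_val - u(X_val)) ** 2)
            if best is None or mse < best[0]:
                best = (mse, xi, gamma, u)
    return best

rng = np.random.default_rng(0)
u_true = lambda X: np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1]) + X[:, 0]  # hypothetical
background = lambda X: np.column_stack([np.ones(len(X)), X[:, 0]])   # hypothetical basis, N = 2

t = np.linspace(0.0, 1.0, 6)
g1, g2 = np.meshgrid(t, t, indexing="ij")
X_tr = np.column_stack([g1.ravel(), g2.ravel()])     # uniform training grid, M = 36
X_val = rng.uniform(0.0, 1.0, (18, 2))               # random validation points, I = M / 2
sigma = 0.1
y_tr = u_true(X_tr) + sigma * rng.standard_normal(len(X_tr))
y_val = u_true(X_val) + sigma * rng.standard_normal(len(X_val))

xis = [1e-6, 1e-4, 1e-2, 1.0]       # hypothetical search grid for xi
gammas = [1.0, 2.0, 4.0]            # hypothetical search grid for gamma
mse_best, xi_best, gamma_best, u_best = holdout_select(
    X_tr, y_tr, X_val, y_val, background, xis, gammas)
```

One design property worth noting: if the data lie exactly in the background space, the system is solved by a zero update and the background coefficients are recovered for any $\xi > 0$, so the regularization only acts on the part of the state that the background cannot explain.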

Cross-validation is based on the partition of the dataset $\mathcal{D}_L$ into $k$ equal-sized subsamples (folds) $\{\mathcal{D}^{(j)}\}_{j=1}^k$.
Of the $k$ folds, a single fold is retained for validation and the remaining $k - 1$ folds are used for training. The
procedure is then repeated $k$ times, with each of the $k$ folds used once as the validation dataset. In the limit
$k = L$, the procedure is known as Leave-One-Out Cross-Validation (LOOCV).

We comment on the application of K-fold Cross-Validation to the APBDW framework. First, we observe
that, even for moderate $L$, K-fold Cross-Validation can be quite expensive if $k \approx L$. For this reason, generalized
cross-validation strategies, which focus on computing inexpensive approximations of the error
indicator, have been developed. We refer to [15, Chapter 7.10] and to the references therein for further details.
Second, since in our setting sensor locations are not chosen randomly, it seems very hard to justify
cross-validation through a rigorous mathematical argument.

In this paper, we exclusively employ holdout validation and we defer to future work the application of
more advanced cross-validation strategies.


5.  Numerical  results  for  two  synthetic  problems 

In this section, we illustrate the behavior of the APBDW formulation through two two-dimensional model
problems: an acoustic Helmholtz problem¹, and a heat transfer problem².

Before presenting the results, we briefly summarize some details concerning the implementation of our
method. In both tests, we employ csRBF with $k = 1$; recalling Proposition 2.9, this corresponds to $U =
H^{2.5}(\mathbb{R}^2)$. We rely on holdout validation for the choice of $\xi$ and of the kernel parameter $\gamma$: we consider uniform
grids of training observation points $\{x_m\}_{m=1}^M$, and uniformly random generated validation points $\{x_i\}_{i=1}^I$. In
all our tests, we consider $I = M/2$. Performance is measured by computing the mean $L^2$ error (35).


¹ This model problem is the same as that considered in [24, Section 3].

² This model problem is a slight modification of the thermal block problem studied in [26].

5.1. Application to a synthetic acoustic problem

5.1.1. Problem definition

Given the domain $\Omega = (0, 1)^2$, we define the acoustic model problem:

$$\begin{cases} -(1 + i \epsilon \mu) \, \Delta u(\mu) - \mu^2 u(\mu) = \mu \, (2 x_1^2 + e^{x_2}) & \text{in } \Omega, \\ \partial_n u(\mu) = 0 & \text{on } \partial\Omega, \end{cases} \qquad (39)$$

where $\mu > 0$ is the wave number and $\epsilon = 10^{-2}$ is a fixed dissipation.

To assess the performance of the APBDW formulation for various configurations, we define the true field
$u^{\rm true}$ as the solution to (39) for some $\mu^{\rm true} \in \mathcal{P} = [2, 10]$. Figure 1 shows the true field for three choices of the
wave number $\mu$. We approximate the solution using a triangular P5 finite element discretization.


Figure  1.  Acoustic  synthetic  problem:  visualization  of  the  truth  solutions  associated  with  the 
synthetic  Helmholtz  problem. 


We introduce some definitions. We consider noisy observations with additive Gaussian noise:

$$y_\ell = u^{\rm true}(x_\ell) + \epsilon_\ell, \qquad \epsilon_\ell \overset{\rm iid}{\sim} \mathcal{N}(0, \sigma^2). \qquad (40)$$

Then, we introduce the background spaces $\{Z_N\}_N$, built using the weak-Greedy algorithm. As mentioned
before, we employ csRBF with $k = 1$; recalling Proposition 2.9, this corresponds to $U = H^{2.5}(\mathbb{R}^2)$.

5.1.2.  Results 

Figure 2(a) investigates the convergence with $N$ for a fixed number of sensors $M$ in the absence of noise ($\sigma = 0$).
To assess the performance, we compute the relative $L^2$ error averaged over 20 fields associated with different
choices of the parameter $\mu$:

$$E_{\rm avg} := \frac{1}{|\mathcal{P}_{\rm train}|} \sum_{\mu \in \mathcal{P}_{\rm train}} \frac{\|u^{\rm true}(\mu) - u_\xi^*(\mu)\|_{L^2(\Omega)}}{\|u^{\rm true}(\mu)\|_{L^2(\Omega)}}. \qquad (41)$$

We observe monotone convergence with respect to $N$ of $E_{\rm avg}$ in the absence of noise.

Figure 2(b) shows the convergence with $M$ for fixed $N$ and noise-free measurements. We assess performance
by computing $E_{\rm avg}$ in (41) averaged over 20 fields. We observe that the rate of convergence with $M$ weakly
depends on the value of $N$: in this test, we observe $E_{\rm avg} \sim M^{-1.4}$ for $N = 0$, $E_{\rm avg} \sim M^{-1.5}$ for $N = 3$, and
$E_{\rm avg} \sim M^{-1.4}$ for $N = 4$. This confirms the multiplicative effect between $N$ convergence and $M$ convergence
observed in Remark 3.4. We further observe that convergence with $M$ is faster than the expected $M^{-1.25}$ (see
(30), $\tau = 2.5$). This might be related to the validation procedure for $\gamma$ that effectively makes the inner product
depend on $M$.
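Convergence rates of the form $E_{\rm avg} \sim M^{-p}$ are typically estimated by a linear least-squares fit in log-log scale. The paper does not state how the rates were obtained, so the following is only a sketch of one common approach; the data are synthetic and the function name is hypothetical.

```python
import numpy as np

def estimate_rate(Ms, errors):
    """Least-squares fit of log(error) = log(C) - p * log(M); returns (p, C)."""
    slope, intercept = np.polyfit(np.log(Ms), np.log(errors), 1)
    return -slope, np.exp(intercept)

# Synthetic errors decaying exactly like 3 * M^(-1.4), used only to exercise the fit.
Ms = np.array([16.0, 36.0, 64.0, 100.0, 144.0])
errors = 3.0 * Ms ** (-1.4)

rate, constant = estimate_rate(Ms, errors)
```

For exact power-law data the fit recovers the exponent and the constant; with measured errors the estimated rate depends on the range of $M$ included in the fit.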

In Figure 2(c), we study the convergence with $M$ in the noisy case. As in the previous tests, we assess
performance by computing $E_{\rm avg}$ in (41) averaged over 10 different fields associated with choices of the parameter
$\mu$ and over 15 realizations of the random noise. We consider the background $Z_{N=2}$. We observe that the
estimated convergence rate in the noisy case is $M^{-0.45}$ for all values of the standard deviation $\sigma$ considered;
interestingly, the rate does not depend on $\sigma$.


Figure  2.  Acoustic  synthetic  problem.  Figure  (a):  convergence  with  N  for  fixed  M.  Figure 
(b):  convergence  with  M  for  fixed  N.  Figure  (c):  convergence  with  M  for  noisy  data  ( N  =  2). 


We finally investigate the connection between the optimal value of $\xi$ and the signal-to-noise ratio. In Figure
3, we compute the mean squared error over the validation set for the estimation of the state associated with
the parameter $\mu = 3.68$. We consider $M = 64$ and we compute the mean squared error based on $I = 32$. For
this test, we employ the background $Z_{N=2}$. While for $\sigma = 0.05$ the mean squared error is stable as $\xi \to 0^+$, for
$\sigma = 1$ the mean squared error increases as $\xi \to 0^+$ and reaches a minimum for $\xi \approx 10^{-2}$. Since the level of noise
is in practice unknown, this numerical result empirically motivates the importance of adapting the value of $\xi$.


(a) $\sigma = 0.05$ (b) $\sigma = 1$


Figure 3. Acoustic synthetic problem: behavior of $\min_\gamma MSE(\xi, \gamma)$ for two different noise levels.
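
5.2. Application to a synthetic non-smooth problem

5.2.1. Problem definition

In this section, we consider the problem:

$$\begin{cases} -\nabla \cdot (\kappa(\mu) \nabla u(\mu)) = 0 & \text{in } \Omega, \\ \kappa(\mu) \, \partial_n u(\mu) = g & \text{on } \Gamma_1 \cup \Gamma_2 \cup \Gamma_3, \\ u(\mu) = 0 & \text{on } \Gamma_4, \end{cases} \qquad \kappa(x, \mu) = \begin{cases} 1 & \text{in } \Omega_1, \\ \mu_i & \text{in } \Omega_{i+1}, \; i = 1, \ldots, 8, \end{cases} \qquad (42a)$$

where

$$g(x) = \begin{cases} 1 & \text{on } \Gamma_1, \\ 0 & \text{on } \Gamma_2, \\ 1 - 2 x_1 & \text{on } \Gamma_3. \end{cases} \qquad (42b)$$

The subdomains $\{\Omega_i\}_i$ and the edges $\{\Gamma_j\}_j$ are shown in Figure 4, while the parameter $\mu$ belongs to the compact
set $\mathcal{P} = [0.5, 2.5]^8$. Figure 5 shows the true field for three different choices of the parameters³. Computations
are based on a continuous P3 Finite Element discretization.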


Figure 4. Thermal block synthetic problem: computational domain.

Unlike the acoustic problem considered before, here the solution does not belong to the native space $U =
H^{2.5}(\mathbb{R}^2)$ associated with the kernel employed. By applying APBDW to this problem, we want to assess
how our method performs when applied to non-smooth fields. As in the previous example, we consider noisy
observations with additive Gaussian noise (40). Then, we introduce the background spaces $\{Z_N\}_N$, built using
the weak-Greedy algorithm.

5.2.2.  Results 

Figure 6 replicates the results shown in Figure 2 for the thermal block problem. As in the previous case, in
Figure 6(a), we observe monotone convergence with respect to $N$ of $E_{\rm avg}$ in the absence of noise. In Figure
6(b), we observe $E_{\rm avg} \sim M^{-1.00}$ for $N = 0, 1, 6$. As expected, due to the lack of regularity of the state
field, the convergence rate is significantly lower than in the previous example. However, we still observe a
multiplicative effect between $N$ convergence and $M$ convergence. Finally, in Figure 6(c), we observe that the
estimated convergence rate in the noisy case is $M^{-0.3}$ for all values of the standard deviation considered
($\sigma = 0.1, 0.2, 0.4$). In the noisy case, the convergence rate is comparable with the previous example.

³ The values of the parameter are $\mu^1 = [1.4418, 1.3487, 1.0296, 1.9081, 1.8933, 1.7704, 1.3921, 1.1642]$, $\mu^2 =
[0.5672, 2.0695, 1.7911, 1.3486, 2.4778, 1.4982, 1.1617, 0.5927]$, $\mu^3 = [0.6360, 1.4003, 1.5401, 1.1370, 0.9188, 1.2972, 0.9252, 0.7558]$.

(a) $u(\mu^1)$ (b) $u(\mu^2)$ (c) $u(\mu^3)$

Figure 5. Thermal block synthetic problem: example of the true field for three values of the
parameters.


Figure 6. Thermal block synthetic problem. Figure (a): convergence with $N$ for fixed $M$.
Figure (b): convergence with $M$ for fixed $N$. Figure (c): convergence with $M$ for noisy data
($N = 2$).


6.  Application  to  a  thermal  patch  configuration 
6.1.  Description  of  the  experiment 

The thermal patch system consists of a 1.5 mm thick acrylic sheet heated from behind by a resistive patch.
Heat is generated through an electrical resistance with input power equal to 0.667 W. The goal of the data
assimilation procedure is to estimate the temperature field over a portion $\Omega^{\rm obs,dim}$ of the external surface of the
plate at the steady-state limit.

We now describe the data acquisition procedure. We use an IR camera (Fluke Ti 9) to take measurements
in the rectangular region $\Omega^{\rm obs,dim} = [-23.85, 23.85] \times [-17.85, 17.85]$ (mm) centered on the patch. Figure 7(a)
shows the IR camera. After the patch power is turned on, we take measurements using a sampling time of 4
seconds until steady state is reached; the total duration of the experiment is roughly 5 minutes. The external
temperature is about 20 °C, roughly constant throughout the experiment. Each surface measurement taken from
the IR camera corresponds to 160 × 120 pixel-wise measurements; the pixel size is roughly $\Delta h^{\rm device} = 0.3$ mm,
which is much smaller than the spatial length scale of the phenomenon of interest.

In view of the mathematical description of the problem, we present formal definitions for the geometric
quantities involved. First, we introduce the domain $\Omega^{\rm bk,dim} \subset \mathbb{R}^3$ corresponding to the three-dimensional
acrylic sheet. We denote by $\Gamma^{\rm patch,dim} \subset \mathbb{R}^2$ the surface of the sheet attached to the patch, and we denote
by $\Gamma^{\rm in,dim}$ the face of the sheet that contains $\Gamma^{\rm patch,dim}$. We recall that $\Omega^{\rm obs,dim} \subset \partial\Omega^{\rm bk,dim}$ is the region in
which the IR camera takes measurements. Then, we introduce the Cartesian coordinate system $x_1^{\rm dim} x_2^{\rm dim} x_3^{\rm dim}$;
according to our definitions, the IR camera takes measurements in the $x_1^{\rm dim} x_2^{\rm dim}$ plane. Figures 7(b) and (c)
clarify the definitions of $\Omega^{\rm obs,dim}$, $\Gamma^{\rm patch,dim}$ and $\Gamma^{\rm in,dim}$ and show the characteristic dimensions of the
patch.


(a) (b) (c)


Figure 7. Thermal patch problem. Figure (a): IR camera. Figures (b) and (c): mathematical
description of the acrylic sheet. $L = 22.606$ mm, $H = 9.271$ mm.

We now briefly comment on the noise in the dataset and we define the true field. In Figure 8, we show
two spatial slices of the field $u^{\rm obs,dim} - u^{\rm filt,dim}$. The field $u^{\rm obs,dim}$ is obtained directly from the IR camera,
while $u^{\rm filt,dim}$ is obtained by applying a Wiener filter (see, e.g., [20]) based on a 3 by 3 pixel averaging to the field
$u^{\rm obs,dim}$. Comparing $u^{\rm filt,dim}$ and $u^{\rm obs,dim}$, we can deduce that the magnitude of the noise in the measurements is
approximately ±0.5 °C, roughly independent of the spatial position.


(a) $u^{\rm obs,dim}$ (b) $x_2^{\rm dim} = 0.2$ mm (c) $x_2^{\rm dim} = -12.0$ mm


Figure 8. Thermal patch problem: comparison between filtered and unfiltered fields. Figure
(a): observed thermal field $u^{\rm obs,dim}$. Figures (b) and (c): spatial slices of the difference $u^{\rm obs,dim} - u^{\rm filt,dim}$.

We  shall  now  motivate  this  model  problem  from  the  engineering  standpoint.  Full- field  information  is  typically 
not  available;  in  practical  applications,  we  envision  a  system  with  a  local  sensor  or  a  small  sensor  array.  For 
this  reason,  we  want  to  design  a  data  assimilation  state  estimation  procedure  that  is  able  to  reconstruct  the  full 
field  based  on  a  small  amount  of  local  measurements. 

Since  the  IR  camera  provides  full-field  information,  in  this  work,  we  synthetize  local  measurements  -  the 
experimental  input  to  our  methods  -  from  the  IR  camera  to  obtain  £^s  =  £{uobs,xc^s)  where  the  observation 
functional  £(-,x^a)  is  designed  to  represent  our  fictitious  measurement  in  the  sensor  location  x^s  G  Llobs .  We 
observe  that  the  IR  camera  permits  us  to  conduct  convergence  studies  that  would  typically  not  be  feasible  in 
actual  field  deployment. 


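6.3. Mathematical model and background space

We first present the parametrized best-knowledge model in dimensional form: find $u^{\rm bk,dim}: \Omega^{\rm bk,dim} \to \mathbb{R}$
such that

$$\begin{cases} -\Delta u^{\rm bk,dim} = 0 & \text{in } \Omega^{\rm bk,dim}, \\ \kappa \, \partial_n u^{\rm bk,dim} + \gamma \left( u^{\rm bk,dim} - \Theta^{\rm room,dim} \right) = g^{\rm dim} \chi_{\Gamma^{\rm patch,dim}} & \text{on } \Gamma^{\rm in,dim}, \\ \kappa \, \partial_n u^{\rm bk,dim} = 0 & \text{on } \partial\Omega^{\rm bk,dim} \setminus \Gamma^{\rm in,dim}, \end{cases} \qquad (43)$$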


where $\gamma$ is the convective heat transfer coefficient, $\kappa$ is the thermal conductivity, $\Theta^{\rm room,dim} = 20$ °C is the
room temperature, and $g^{\rm dim}$ is the incoming flux modeling the heat exchange between the patch and the plate.
Textbook values for the model parameters are $\kappa = 0.2$ W/(m K), $\gamma = 10$ W/(m² K).

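We now adimensionalize the equation and we define the best-knowledge manifold. Towards this end, we
define

$$u^{\rm bk}(x) = \frac{u^{\rm bk,dim}(L x) - \Theta^{\rm room,dim}}{\Delta\Theta}, \qquad (44)$$

where $\Delta\Theta = 50$ °C is a rough approximation of the temperature difference between the far field and the center
of the patch, and $L = 22.606$ mm is the length of the edge of the patch (see Figure 7). We observe that $u^{\rm bk} = u^{\rm bk}(\mu)$
satisfies

$$\begin{cases} -\Delta u^{\rm bk}(\mu) = 0 & \text{in } \Omega^{\rm bk}, \\ \partial_n u^{\rm bk}(\mu) + \mu \, u^{\rm bk}(\mu) = g & \text{on } \Gamma^{\rm in}, \\ \partial_n u^{\rm bk}(\mu) = 0 & \text{on } \partial\Omega^{\rm bk} \setminus \Gamma^{\rm in}, \end{cases} \qquad (45a)$$

where $\mu = L\gamma/\kappa \approx 1.13$ and $g$ is defined as follows:

$$g(x) = C \chi_{\Gamma^{\rm patch}}(x). \qquad (45b)$$

Since the model is linear with respect to $C$ and our ultimate goal is to define a linear space associated with
the best-knowledge manifold, we can simply set $C = 1$. Assuming that the estimate of $\kappa$ is accurate and that
$\gamma \approx 10 \pm 5$ W/(m² K), we have that $\mu \in \mathcal{P} = [0.5650, 1.650]$. We can thus define the best-knowledge manifold as
follows:

$$\mathcal{M}^{\rm bk} = \{u^{\rm bk}(\mu) : \mu \in \mathcal{P}\}. \qquad (46)$$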
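The nominal value of the nondimensional parameter and the end points of its range follow from $\mu = L\gamma/\kappa$ with the values quoted in the text. The short computation below is only an arithmetic check; the variable names are hypothetical, and the units of $\kappa$ and $\gamma$ are assumed to be W/(m K) and W/(m² K), respectively.

```python
# Values quoted in the text, converted to SI units.
L = 22.606e-3        # patch edge length [m]
kappa = 0.2          # thermal conductivity, assumed in W/(m K)
gamma_nominal = 10.0 # convective heat transfer coefficient, assumed in W/(m^2 K)
gamma_low, gamma_high = 5.0, 15.0   # range gamma = 10 +/- 5

def biot_like_parameter(gamma):
    """Nondimensional parameter mu = L * gamma / kappa."""
    return L * gamma / kappa

mu_nominal = biot_like_parameter(gamma_nominal)   # about 1.13
mu_low = biot_like_parameter(gamma_low)           # about 0.565
mu_high = biot_like_parameter(gamma_high)         # about 1.695
```

With these inputs the nominal value and the lower end point match the figures quoted above; the upper end point evaluates to approximately 1.695, slightly above the quoted 1.650, so the quoted upper bound may have been obtained with slightly different inputs or rounding.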

Some comments are in order. Parametric uncertainty in the model is associated with the value of $\mu$ (that is,
with the convective heat transfer coefficient and the thermal conductivity), while non-parametric uncertainty is
mainly related to the nonlinear effects associated with natural convection and to the heat exchange between the
patch and the sheet. To compute the solution to the best-knowledge model, we resort to a P3 continuous Finite
Element discretization based on $\mathcal{N} = 40000$ degrees of freedom.

The background space $Z_N^{\rm bk}$ associated with (45)-(46) is built using the weak-Greedy algorithm. Then, we
restrict the space $Z_N^{\rm bk}$ to the domain of interest $\Omega = \Omega^{\rm obs}$ to form $Z_N$.

6.4.  Numerical  results 

Figure 9 shows the convergence with $M$ for the RB space $Z_N$ for $N = 0$, $N = 1$ and $N = 2$. For this test, we consider holdout validation for $\xi$ and $\gamma$ with $I = M/2$. On the y-axis, we report the mean squared error computed based on the entire full-field information.
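To make the holdout step concrete, here is a minimal sketch on a one-dimensional synthetic problem. It is not the procedure used for Figure 9: it assumes $N = 0$ (no background), a Gaussian kernel, a plain grid search, and the normal equations $(\xi m I + K) a = y$ for a kernel ridge fit on $m$ training points. All function names and the test data are hypothetical.

```python
import math
import random

def kernel(x, y, gamma):
    """Gaussian kernel with width parameter gamma (an assumed choice)."""
    return math.exp(-gamma * (x - y) ** 2)

def solve(A, b):
    """Solve the dense system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit(xs, ys, xi, gamma):
    """Coefficients a solving (xi * m * I + K) a = y, with m training points."""
    m = len(xs)
    A = [[kernel(xs[i], xs[j], gamma) + (xi * m if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    return solve(A, ys)

def predict(centers, coeffs, gamma, x):
    """Evaluate the kernel expansion sum_m a_m K(x_m, x)."""
    return sum(a * kernel(c, x, gamma) for a, c in zip(coeffs, centers))

def holdout_mse(xs, ys, n_val, xi, gamma):
    """Train on the first M - n_val points, score on the last n_val points."""
    x_tr, y_tr = xs[:-n_val], ys[:-n_val]
    x_va, y_va = xs[-n_val:], ys[-n_val:]
    coeffs = fit(x_tr, y_tr, xi, gamma)
    return sum((predict(x_tr, coeffs, gamma, x) - y) ** 2
               for x, y in zip(x_va, y_va)) / n_val

def select_by_holdout(xs, ys, xis, gammas):
    """Grid search over (xi, gamma) using I = M / 2 held-out measurements."""
    n_val = len(xs) // 2
    scores = {(xi, g): holdout_mse(xs, ys, n_val, xi, g)
              for xi in xis for g in gammas}
    best = min(scores, key=scores.get)
    return best, scores

random.seed(0)
M = 40
xs = [random.random() for _ in range(M)]  # random centers, so no shuffling is needed
ys = [math.sin(2 * math.pi * x) + 0.05 * random.gauss(0.0, 1.0) for x in xs]
(best_xi, best_gamma), scores = select_by_holdout(
    xs, ys, xis=[1e-4, 1e-2, 1.0], gammas=[1.0, 10.0, 100.0])
print(best_xi, best_gamma, scores[(best_xi, best_gamma)])
```

The held-out half is used only for scoring; once the pair is selected, one would typically refit on all $M$ measurements.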


Figure 9. Thermal patch problem: convergence with $M$ for fixed $N$.

We observe that for $M \approx 100$ we reach the noise level

$$
\ldots := \frac{\| \ldots \|_{L^2(\Omega)}}{\| u^{\rm obs} \|_{L^2(\Omega)}}.
$$

We also observe that, while including the first snapshot leads to a substantial improvement in performance, considering $N > 1$ does not lead to any substantial improvement.

7.  Conclusions 

In  this  paper,  we  presented  the  APBDW  approach  to  the  variational  data  assimilation  (state  estimation) 
problem.  The  approach  generalizes  the  PBDW  formulation  presented  in  [24]  to  the  case  of  pointwise  noisy 
measurements. We also discussed a well-posedness analysis and a priori and a posteriori error estimates for the $L^2$ state-estimation error. Our well-posedness analysis also holds when the background space does not belong to the native space induced by the kernel employed. On the other hand, the error analysis relies on the assumption that both the true field and the background belong to the native space. We finally presented synthetic and experimental numerical results to demonstrate the effectiveness of our approach.

We  now  identify  three  extensions  to  the  approach,  which  are  subjects  of  future  work.  First,  we  wish  to  design 
strategies  for  the  selection  of  the  observation  centers  that  address  both  stability  and  approximation.  This  would 
tighten the connection between our approach and the design of the experiment. Second, we wish to consider more general observation functionals of the form $y_m = \ell_m(u^{\rm true}) + \epsilon_m$. This would allow us to take into account different sources of information. In this respect, we observe that the well-posedness analysis does not depend on the fact that $y_m = u^{\rm true}(x_m) + \epsilon_m$, while the error analysis relies on the form of the observation functionals. Finally, we wish to exploit the interpretation of APBDW as a convex relaxation of the partial spline model for a particular choice of the background to derive new probabilistic and deterministic error bounds and possibly improve the performance of the state-estimation procedure.
Appendix A. Proofs of the error bounds

In this appendix, we provide the proofs of the results presented in section 3. The appendix is organized as follows. In section A.1, we introduce a regularized formulation and we show the connection with our formulation. In section A.2, we exploit the results for the regularized problem to prove Proposition 3.1. Finally, in section A.3, we prove the result for fixed-design regression.


A.1. Preliminaries

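We introduce a regularized formulation of the APBDW statement proposed in this work: given $\lambda \geq 0$, $\xi > 0$, find $u^*_{\lambda,\xi} \in \mathcal{U}$ such that

$$
u^*_{\lambda,\xi} = \arg\min_{u \in \mathcal{U}} \; J_{\lambda,\xi}(u) := \xi \|u\|^2_{\lambda,N} + V_M(u),
\qquad (47)
$$

where the seminorm $\|\cdot\|_{\lambda,N}$ is defined as

$$
\|w\|^2_{\lambda,N} = \lambda \|\Pi_{Z_N} w\|^2 + \|\Pi_{Z_N^\perp} w\|^2.
\qquad (48)
$$

We observe that for any $\lambda > 0$, the function $\|\cdot\|_{\lambda,N}$ is a norm equivalent to $\|\cdot\|$. We also observe that for $\lambda = 0$, problem (47) corresponds to (2).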

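The next proposition summarizes a number of properties of problem (47) that are crucial to prove the error bounds for $u^*_\xi$.

Proposition A.1. Let $\beta_{N,M} > 0$. Then, the following hold.

(1) For any $\lambda > 0$, the solution to (47) exists and is unique. Furthermore, if we introduce $\eta^*_{\lambda,\xi} = \Pi_{Z_N^\perp} u^*_{\lambda,\xi}$ and $z^*_{\lambda,\xi} = \Pi_{Z_N} u^*_{\lambda,\xi}$, we have that $\eta^*_{\lambda,\xi} \in {\rm span}\{\Pi_{Z_N^\perp} K_{x_m}\}_{m=1}^M$ and $z^*_{\lambda,\xi} \in {\rm span}\{\Pi_{Z_N} K_{x_m}\}_{m=1}^M$.

(2) For any $\xi > 0$, the solution $u^*_{\lambda,\xi}$ converges to the solution $u^*_\xi$ to (2) when $\lambda \to 0^+$.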

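(3) For any $\lambda \geq 0$, the following bounds hold:

$$
\|u^{\rm true} - u^*_{\lambda,\xi}\|_{\lambda,N} \leq 2 \|u^{\rm true}\|_{\lambda,N} + \ldots
\qquad (49a)
$$

and

$$
\|u^{\rm true} - u^*_{\lambda,\xi}\|_{\ell^2(X_M)} \leq \sqrt{M} \left( \delta + \ldots \right).
\qquad (49b)
$$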

We  prove  each  statement  separately. 

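Proof. (statement 1) For any $\lambda > 0$, $u \mapsto \|u\|^2_{\lambda,N}$ is strictly convex, while $u \mapsto V_M(u)$ is convex. This implies that for any $\xi > 0$ the objective function $J_{\lambda,\xi}(u) = \xi \|u\|^2_{\lambda,N} + V_M(u)$ is strictly convex. Therefore, existence and uniqueness of the solution to (47) follow from [10, Theorem 3, Chapter 8.2].

We observe that $x \mapsto \Phi^{\lambda,N}_x = \frac{1}{\lambda} \Pi_{Z_N} K_x + \Pi_{Z_N^\perp} K_x$ is the feature map associated with $(\mathcal{U}, \|\cdot\|_{\lambda,N})$. We have indeed that for all $v \in \mathcal{U}$ and $x \in \Omega$

$$
(\Phi^{\lambda,N}_x, v)_{\lambda,N} = \frac{\lambda}{\lambda} (\Pi_{Z_N} K_x, v) + (\Pi_{Z_N^\perp} K_x, v) = (K_x, \Pi_{Z_N} v + \Pi_{Z_N^\perp} v) = (K_x, v) = v(x).
$$

Exploiting the representer theorem (see, e.g., [34, Theorem 16.1]), we have that $u^*_{\lambda,\xi} \in {\rm span}\{\Phi^{\lambda,N}_{x_m}\}_{m=1}^M$. As a result, we have that $\eta^*_{\lambda,\xi} \in {\rm span}\{\Pi_{Z_N^\perp} \Phi^{\lambda,N}_{x_m}\}_{m=1}^M$, and $z^*_{\lambda,\xi} \in {\rm span}\{\Pi_{Z_N} \Phi^{\lambda,N}_{x_m}\}_{m=1}^M$ for any $\lambda > 0$. □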
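The reproducing identity for the feature map can be checked numerically in a toy setting. The sketch below is only an illustration under simplifying assumptions: the Hilbert space is $\mathbb{R}^4$ with a weighted inner product, point evaluation is replaced by reading one vector entry, and the background space is spanned by a single vector. All names and values are hypothetical.

```python
def inner(u, v, w):
    """Weighted inner product (u, v) = sum_i w_i u_i v_i on R^n."""
    return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))

def project(v, z, w):
    """Orthogonal projection of v onto Z = span{z} in the weighted inner product."""
    c = inner(z, v, w) / inner(z, z, w)
    return [c * zi for zi in z]

def inner_lam(u, v, z, w, lam):
    """(u, v)_{lam,N} = lam (P u, P v) + ((I - P) u, (I - P) v), with P = project."""
    pu, pv = project(u, z, w), project(v, z, w)
    qu = [a - b for a, b in zip(u, pu)]
    qv = [a - b for a, b in zip(v, pv)]
    return lam * inner(pu, pv, w) + inner(qu, qv, w)

def riesz(i, w):
    """Representer K_i of 'evaluation at index i': (K_i, v) = v[i]."""
    return [1.0 / w[i] if j == i else 0.0 for j in range(len(w))]

def feature_map(i, z, w, lam):
    """Phi_i = (1 / lam) P K_i + (I - P) K_i."""
    k = riesz(i, w)
    pk = project(k, z, w)
    return [p / lam + (ki - p) for ki, p in zip(k, pk)]

w = [1.0, 2.0, 0.5, 3.0]    # weights of the inner product (arbitrary, positive)
z = [1.0, 1.0, 1.0, 1.0]    # single background direction (N = 1)
v = [0.3, -1.2, 2.0, 0.7]   # arbitrary test vector
lam = 0.1

# (Phi_i, v)_{lam,N} should reproduce v[i] for every index i.
errors = [abs(inner_lam(feature_map(i, z, w, lam), v, z, w, lam) - v[i])
          for i in range(len(v))]
print(max(errors))
```

The largest discrepancy should be at round-off level, reflecting the cancellation of $\lambda$ against $1/\lambda$ on the background component.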
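Proof. (statement 2) Let $\{\lambda_j\}_j$ be a real sequence such that $\lambda_j \to 0^+$. Exploiting the first statement of Proposition A.1, we have that the sequences $\{\eta^*_{\lambda_j,\xi}\}_j$ and $\{z^*_{\lambda_j,\xi}\}_j$ belong to finite-dimensional spaces that do not depend on $\lambda$. Furthermore, applying Lemma 2.3, we have that they are uniformly bounded for all $j$. Applying the Bolzano-Weierstrass theorem, the sequence $\{u^*_{\lambda_j,\xi}\}_j$ admits a subsequence $\{u^*_{\lambda_k,\xi}\}_k$ that converges strongly to some $\hat{u}_\xi \in \mathcal{U}$.

We now show that $\hat{u}_\xi = u^*_\xi$. We first observe that

$$
J_{\lambda_k,\xi}(u^*_{\lambda_k,\xi}) = \xi \left( \lambda_k \|z^*_{\lambda_k,\xi}\|^2 + \|\eta^*_{\lambda_k,\xi}\|^2 \right) + V_M(u^*_{\lambda_k,\xi}) \to J_{0,\xi}(\hat{u}_\xi), \qquad k \to \infty.
$$

We further observe that for any $\lambda_k > 0$

$$
J_{\lambda_k,\xi}(u^*_{\lambda_k,\xi}) \leq J_{\lambda_k,\xi}(u^*_\xi), \qquad k = 1, 2, \ldots,
$$

and by taking the limit on both sides, we obtain $J_{0,\xi}(\hat{u}_\xi) \leq J_{0,\xi}(u^*_\xi)$.

Since $u^*_\xi$ is the unique minimizer of (2), we must have $\hat{u}_\xi = u^*_\xi$. Furthermore, by the same argument, $u^*_\xi$ must be the only limit point of the sequence; therefore, the entire sequence converges to $u^*_\xi$. Thesis follows. □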

Proof. (statement 3) For $\lambda > 0$, $\|\cdot\|_{\lambda,N}$ is a norm for $\mathcal{U}$; therefore, estimates (49a) and (49b) follow directly from [18, Corollary 4.3] and [18, Lemma 4.5].

The extension to $\lambda = 0$ follows by observing that $u^*_{\lambda,\xi}$ converges to $u^*_\xi$ when $\lambda \to 0^+$. □

Before  proving  the  error  bounds,  we  prove  (26b). 

Lemma A.2. Let $\Omega$ be a Lipschitz domain and let $\mathcal{U}$ be the Sobolev space $H^\tau(\Omega)$ with $\tau > d/2$. Let us assume that the inf-sup constant $\beta_{N,M}$ defined in (3) is strictly positive and $h_{X_M} < 1$. Then,

$$
C_{N,X_M} \leq \frac{C}{\min\{ c_{N,M}, \, 1 - h_{X_M}^{2\tau - d} \}},
$$

where $c_{N,M}$ is defined in (6) and $C$ depends on the domain $\Omega$ and on $(\cdot,\cdot)$.

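Proof. Let us define the constant

$$
C_{X_M} := \sup_{u \in \mathcal{U}} \frac{\|u\|^2_{L^2(\Omega)}}{h_{X_M}^{2\tau} \|u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)}}.
$$

Recalling [18, Theorem 4.8], $C_{X_M}$ is bounded from above by a constant $C$ that does not depend on $M$. Since $\beta_{N,M} > 0$, recalling Lemma 2.3, we have that

$$
\|\Pi_{Z_N^\perp} u\|^2 + \|u\|^2_{\ell^2(X_M)} \geq c_{N,M} \|u\|^2,
$$

where $c_{N,M} > 0$ is given by the expression in (6). Then, we observe that

$$
\begin{aligned}
h_{X_M}^{2\tau} \|\Pi_{Z_N^\perp} u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)}
&= h_{X_M}^{2\tau} \left( \|\Pi_{Z_N^\perp} u\|^2 + \|u\|^2_{\ell^2(X_M)} \right) + \left( h_{X_M}^{d} - h_{X_M}^{2\tau} \right) \|u\|^2_{\ell^2(X_M)} \\
&\geq c_{N,M} \, h_{X_M}^{2\tau} \|u\|^2 + \left( 1 - h_{X_M}^{2\tau - d} \right) h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)} \\
&\geq \min\{ c_{N,M}, \, 1 - h_{X_M}^{2\tau - d} \} \left( h_{X_M}^{2\tau} \|u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)} \right).
\end{aligned}
$$

As a result,

$$
C_{N,X_M} = \sup_{u \in \mathcal{U}} \frac{\|u\|^2_{L^2(\Omega)}}{h_{X_M}^{2\tau} \|\Pi_{Z_N^\perp} u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)}}
\leq \left( \sup_{u \in \mathcal{U}} \frac{\|u\|^2_{L^2(\Omega)}}{h_{X_M}^{2\tau} \|u\|^2 + h_{X_M}^{d} \|u\|^2_{\ell^2(X_M)}} \right) \frac{1}{\min\{ c_{N,M}, \, 1 - h_{X_M}^{2\tau - d} \}}.
$$

Thesis follows.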


A.2. Proof of the error bound for scattered data


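Proof. (Proposition 3.1) The proof replicates the argument of [18, Theorem 4.11]. Recalling the definition of $C_{N,X_M}$, we have

$$
\|u^{\rm true} - u^*_\xi\|^2_{L^2(\Omega)} \leq C_{N,X_M} \left( h_{X_M}^{2\tau} \|\Pi_{Z_N^\perp} (u^{\rm true} - u^*_\xi)\|^2 + h_{X_M}^{d} \|u^{\rm true} - u^*_\xi\|^2_{\ell^2(X_M)} \right).
$$

Then, using (49a) and (49b), we obtain

$$
\|u^{\rm true} - u^*_\xi\|^2_{L^2(\Omega)} \leq C_{N,X_M} \left( h_{X_M}^{2\tau} \left( 2 \|\Pi_{Z_N^\perp} u^{\rm true}\| + \ldots \right)^2 + h_{X_M}^{d} \, M \left( \delta + \ldots \right)^2 \right),
$$

which is the thesis. □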

A.3. Proof of the error bound for fixed-design regression

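We first introduce some notation and preliminary definitions. We decompose the datum $\mathbf{y}$ as

$$
\mathbf{y} = \mathbf{y}^{\rm true} + \mathbf{e}, \qquad \mathbf{y}^{\rm true} = [u^{\rm true}(x_1), \ldots, u^{\rm true}(x_M)], \qquad \mathbf{e} = [\epsilon_1, \ldots, \epsilon_M],
$$

and we define $\mathbf{e}^{\rm aug} = [\mathbf{e}; \mathbf{0}] \in \mathbb{R}^{M+N}$. We observe that ${\rm Var}(\mathbf{e}^{\rm aug}) = \sigma^2 \Sigma$, where $\Sigma$ is defined in (31). Then, we introduce the solution $u_\xi^{*,\sigma=0}$ to (2) for $\mathbf{y} = \mathbf{y}^{\rm true}$. We further introduce the vectors of coefficients $\mathbf{u}^*, \mathbf{u}^{*,\sigma=0} \in \mathbb{R}^{M+N}$,

$$
\mathbf{u}^* = \begin{bmatrix} \boldsymbol{\eta}^* \\ \mathbf{z}^* \end{bmatrix}, \qquad
\mathbf{u}^{*,\sigma=0} = \begin{bmatrix} \boldsymbol{\eta}^{*,\sigma=0} \\ \mathbf{z}^{*,\sigma=0} \end{bmatrix},
$$

associated with $u^*_\xi$ and $u_\xi^{*,\sigma=0}$.

We now have the elements to prove Proposition 3.5.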

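Proof. (Proposition 3.5) We observe that

$$
\|u^*_\xi - u_\xi^{*,\sigma=0}\|^2_{L^2(\Omega)} = (\mathbf{u}^* - \mathbf{u}^{*,\sigma=0})^T \mathbf{M} \, (\mathbf{u}^* - \mathbf{u}^{*,\sigma=0}) = (\mathbf{e}^{\rm aug})^T \mathbf{A}^{-T} \mathbf{M} \mathbf{A}^{-1} \mathbf{e}^{\rm aug}.
$$

Then, applying [27, Theorem C, Chapter 14.4], we find

$$
\mathbb{E}\left[ \|u^*_\xi - u_\xi^{*,\sigma=0}\|^2_{L^2(\Omega)} \right] = \sigma^2 \, {\rm trace}\left( \mathbf{A}^{-T} \mathbf{M} \mathbf{A}^{-1} \Sigma \right).
\qquad (50)
$$

Thesis follows by observing that

$$
\mathbb{E}\left[ \|u^{\rm true} - u^*_\xi\|^2_{L^2(\Omega)} \right] \leq 2 \|u^{\rm true} - u_\xi^{*,\sigma=0}\|^2_{L^2(\Omega)} + 2 \, \mathbb{E}\left[ \|u^*_\xi - u_\xi^{*,\sigma=0}\|^2_{L^2(\Omega)} \right]
$$

and then combining estimates (30) and (50).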
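The trace identity for the expectation of a quadratic form, used in the proof above, can be verified exactly on a small example. The sketch below is an illustration under stated assumptions: the noise entries are independent and equal to $\pm\sigma$ with equal probability (so they have zero mean and variance $\sigma^2$), the augmented vector pads the noise with zeros so that $\Sigma = {\rm diag}(I_M, 0)$, and the symmetric matrix `B` is an arbitrary stand-in for the matrix of the quadratic form. All names and values are hypothetical.

```python
from itertools import product

def quad_form(B, e):
    """Evaluate e^T B e for a dense matrix B stored as a list of rows."""
    n = len(e)
    return sum(e[i] * B[i][j] * e[j] for i in range(n) for j in range(n))

def exact_expectation(B, m, sigma):
    """Average e_aug^T B e_aug over all sign patterns of e (entries +/- sigma)."""
    n = len(B)
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=m):
        e_aug = [sigma * s for s in signs] + [0.0] * (n - m)  # pad with zeros
        total += quad_form(B, e_aug)
    return total / 2 ** m

def trace_formula(B, m, sigma):
    """sigma^2 * trace(B Sigma) with Sigma = diag(I_m, 0)."""
    return sigma ** 2 * sum(B[i][i] for i in range(m))

B = [[2.0, 0.5, -1.0, 0.3],
     [0.5, 1.0, 0.2, 0.0],
     [-1.0, 0.2, 3.0, 0.7],
     [0.3, 0.0, 0.7, 4.0]]
m, sigma = 3, 0.5  # three noisy entries, one zero-padded entry

print(exact_expectation(B, m, sigma), trace_formula(B, m, sigma))
```

Because the average runs over every sign pattern, the off-diagonal contributions cancel exactly and only the first $m$ diagonal entries of `B` survive, which is what the trace with $\Sigma$ selects.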